Emil Nielsen (eniel16@student.sdu.dk) & Kasper Schultz Davidsen (kdavi16@student.sdu.dk)

Source code on Bitbucket

 

1 Introduction

 

1.1 Reading guide

In order to understand this blog, the reader is expected to know the basics of augmented reality technology. Having used apps with augmented reality before will certainly be a help. For the later parts of the blog, knowledge of ARKit, SceneKit, and UIKit is required.

Whenever classes from the solution are referred to in the text, their names will be written in bold text.

1.2 Executing the source code

Because this project uses CocoaPods, it is important not to open the Xcode Project file, but instead to open the Xcode Workspace file from the repository. Furthermore, when opening the project, the “identity” and “signing” settings on the AR Programming project may be incorrect and prevent the user from building the code. This can be fixed by using a custom Bundle Identifier and selecting one’s own Team, as highlighted below:

1.3 Problem domain

The demand for IT specialists is greater than ever before, and this need will only continue to grow in the future. Even so, it is a challenge to attract enough skilled people to fields such as computer science, software engineering, and robotics. One explanation could be that most people do not know what it means to work in the IT industry. For this reason, they will not even consider such an education. Some even feel intimidated by these fields because they know little to nothing about them.

The solution is simply to educate children about the very basics of IT, just like they are educated in languages, mathematics, biology, medicine, sports, geography, and many more fields in school. Part of this is giving children a basic understanding of the concept of programming. And one need not even write a single line of code to understand this, as this project aims to prove.

As a result, many industry professionals are teaming up to provide educational programs for school children. We also see a growing number of tools for learning programming in playful ways. These tools all have one thing in common: a medium to high price tag, which may deter newcomers from getting into the field.

1.4 Idea

This project is about developing an application for iOS that allows children to explore the field of programming in an engaging and interactive way. As such, augmented reality will be used to draw children out of their chairs. The solution will also focus the experience on cheap, physical objects. This will allow the users to cooperate more freely than would be possible around a keyboard.

The user of the app has six physical and distinct cards referred to as programming cards. The user can construct programs by laying out these cards in order on a flat surface. Using the camera on an iOS device, the user can scan this sequence of cards in real time and track changes to it. Each card is associated with an action that can be performed by a virtual robot, which will be projected onto the flat surface by the app. On the user’s request, the app can run the program created by the cards, thus moving the robot around. The app will include various exercises that require the robot to move in specific patterns, thus challenging the creativity of the user to come up with the correct sequences of cards. This is the very core of programming!

Through these challenges, the user is expected to learn about program statements, loops, and simple algorithms.

2 Methods and materials

This section will describe all the innovation and qualification tools used to make this project a reality.

2.1 Brainstorming

The project work began with brainstorming in order to come up with ideas for an app. The process was rather informal and carried out in textual form. The goal of the brainstorming session was to find an idea that both group members found interesting. The idea should not be too complicated to implement, although we also wanted to challenge ourselves and preferably try out new, state-of-the-art technology.

2.2 Investigating similar solutions

After having come up with a basic idea, we investigated similar products to gather inspiration on functionality and design. It is of utmost importance that our app has a high educational value for the user, so investigating what similar projects do right was very helpful.

2.3 Prototyping

Before implementing the app in code, we tried out various prototype designs. This was to ensure that we did not waste time programming the UI multiple times for different design ideas. The prototype designs, on the other hand, were easy to change. These designs were created in tools such as Lucidchart for simple diagramming, and MarvelApp for interactive designs with navigation.

2.4 Evaluation

Since the prototypes cover only the UI and not functionality, they will be evaluated on user friendliness, on how well they fulfill the iOS design guidelines, and on whether the UI elements follow native iOS design. For the implementation, we will evaluate which frameworks were used and whether we used them correctly. Each class will be evaluated on whether it clearly belongs to one of the three categories: model, view, or controller. The assigned responsibilities of the classes will also be evaluated, that is, whether their methods and properties achieve low coupling and high cohesion. The class evaluation will form the basis of the architectural evaluation, which assesses how well the architecture fulfills the Model-View-Controller (MVC) design pattern.

2.5 Use case diagrams

Use case diagrams have mainly been used as a documentation tool, and not so much as part of developing the app. A use case diagram has been developed for each of the main views in the app. They can be used to quickly get an overview of what the app is capable of. Thus, it has been possible to compare the use case diagrams to the idea described in section 1.4 and the brainstorm described in section 3.1 to ensure that the app lives up to our expectations.

2.6 Technologies

A major part of the application is the use of augmented reality, which has been realized using ARKit 2 developed by Apple. This framework makes use of the device’s camera and motion sensors to track surfaces in the real world and create so-called anchors in a virtual, three-dimensional coordinate space. The combined use of the camera and motion sensors allows anchor points to persist in the world, even when the surface to which they are attached goes out of view.

In order to work with the coordinate space created by ARKit, we have used Apple’s SceneKit. This is a library for creating three-dimensional models and managing their transformations using vectors and matrices. When an anchor point has been created by ARKit, a SceneKit model can be attached to this anchor, and will thus appear to “exist” in the real world.
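As a minimal sketch, a session combining surface and image detection might be configured as follows (the reference-image group name “Cards” is an assumption for illustration, not taken from the project):

import ARKit

// Minimal sketch of starting an ARKit session that tracks horizontal
// surfaces and looks for a set of reference images (the asset group
// name "Cards" is an assumption).
func startSession(on sceneView: ARSCNView) {
    let configuration = ARWorldTrackingConfiguration()
    configuration.planeDetection = .horizontal
    configuration.detectionImages = ARReferenceImage.referenceImages(
        inGroupNamed: "Cards", bundle: nil)
    sceneView.session.run(configuration)
}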

Core Data has been used to a small extent for persisting user-specific data, but the majority of the app’s data is persisted using the Codable protocol. Lastly, AudioKit has been included for sound effects.

2.7 Architectural design

It has been important for us to design and maintain a proper architecture for the app. Thus, we wanted the elements of the application to be loosely coupled, and we also wanted to split functionality into smaller units to make the code easier to understand. We have attempted to adopt the model-view-controller (MVC) pattern, because this seems to be the standard architecture for iOS apps. As this application is rather heavy on the domain side, we expect to get many model classes compared to a smaller number of views and controllers.

Whenever a controller became too large, it was important for us to split it up into smaller controllers. This could, for instance, be done by dividing a view into smaller, cohesive units, each with a separate controller. The next section will describe how this was performed in practice.

3 Results

This section will present the results of the methods introduced in section 2 and explain some of the most central source code.

3.1 Brainstorming

During the brainstorm we came up with two ideas. The first was an app that used an external API. The app would make a request and get a JSON response with GPS coordinates and other properties of e.g. parking places. This JSON would then be used to create objects displayed on a map, so the user could see available parking places.

The second idea was to work with AR. We came up with the app idea described in section 1.4. We chose to work with this idea because it promised a large learning outcome. We wrote down a list of requirements for the app, ordered by priority. Requirements in parentheses are possibly out of scope for this project, but they are interesting for future work.

    • Detect various programming cards on a table
    • Program a robot according to this sequence of cards
    • Inform the user about the functionality of each card in focus
    • In-app menu to print available cards
    • (In-app tutorials that teach the user about the different cards)
    • (Allow the user to install libraries with cards that extend the app’s functionality)

3.2 Investigating similar solutions

We found two applications that resemble our app idea. The first is Code With Blocks. This app introduces the user to basic programming with blocks representing concepts like branching, variables, and loops. The user can combine these blocks to create a game [source].

The second application is the doodle Coding With Carrots. This doodle uses labels, each representing an action, to make a rabbit move. The purpose of the game is to complete levels by combining labels to make the rabbit collect a predefined number of carrots. The sequence of labels is assembled using drag and drop [source].

Our app idea differs from these applications by using AR and physical cards. Instead of having a character on the screen, we intend to have a robot that appears to be in the real world through AR. Our app will also differ by scanning physical cards to program the robot, instead of blocks and labels. This combination makes the app more physical than the other apps. The app idea is also great for collaboration, since multiple pupils can work together and solve programming problems by combining cards, while each seeing the movement of the robot on their own device.

3.3 Prototyping

The first prototype uses the tab bar for navigating between different sections. The first section displays the gameplay and the second displays a guide on how to play. This design satisfies the iOS design guidelines and is easy to implement. The downside of this design is that the tab bar is always visible while the user plays the game, covering some of the screen.

AR Programming visual prototype 1
First visual prototype

The second visual prototype attempted to find an alternative to tab views. Instead of having easy access to the various views at the bottom of the screen, the less important views would be hidden in a side bar, also known as a navigation drawer. Navigation to important views would be visible on the screen at all times. For instance, in the play view below (middle), there is easy access to the card-information view (right), but access to views such as the card library or a view for printing playing cards has been hidden in a navigation drawer (left). The main purpose of this design was to put as few buttons on the screen as possible so the user is not distracted.

AR Programming visual prototype 2
The second visual prototype

Based on these visual prototypes, some interactive designs were created using MarvelApp. These designs allow the user to press areas of the screen to navigate between the views. The first interactive design centered on the use of the navigation drawer. However, it was found that having to open the navigation drawer was too tedious for users, and many experts actually recommend not using navigation drawers at all [32:00-38:04].

The first interactive prototype can be found here.

AR Programming visual prototype 3
Third visual prototype, also interactive

Another interactive prototype attempted to see how users reacted to using a tab view. The tab bar at the bottom takes up less of the view than we had feared, and it was generally easier to navigate around. Other changes were a smaller information panel in the card information view (right) and the addition of a level selection view (the two views on the left). These views show alternative ways the level select screen could be designed. The second design was chosen, because being able to see the camera feed was too distracting while inside level select.

The second interactive prototype can be found here.

AR Programming visual prototype 4
Fourth visual prototype, also interactive

3.4 Design evaluation

We chose to work with the prototype that uses the tab bar over the navigation drawer, since the navigation drawer is not among the native iOS UI elements provided by Apple. Apple does not recommend navigation drawers, and they require roughly twice as many taps to navigate around.

We have implemented the three most important requirements mentioned in the brainstorm section. We have used the ARKit framework to fulfill these requirements, that is, detecting cards, showing information about the detected card to the user, and drawing a robot in the world. We did not implement the requirement of printing the available cards on paper, as it is a less important requirement, and we think it has a lower learning outcome than the first three. A tutorial section is not important for the learning outcome, but it will be important when the app is used by other users. Allowing a user to install libraries has not been implemented either, as we did not have the time, but it would have given a great learning outcome.

3.5 Use case diagrams

Below are the use case diagrams that have been created. They sum up the most essential functionality available in AR Programming. It is apparent from these diagrams that all features discussed in sections 1.4 and 3.1 have been implemented, with the exception of printing cards.

Use case diagrams for AR Programming
Use case diagrams

Furthermore, some additional capabilities have been added in the form of card-highlighting (which gives visual feedback on the cards that ARKit has detected), a level select view, and the ability to unlock levels by completing previous levels. We have also developed a separate view for browsing the entire library of cards and reading their details. However, we did not make a use case diagram for this view, since it was rather trivial.

3.6 ARKit

The use of ARKit is seen primarily in the ARController class in the controller layer of the application. When ARKit is configured, it is possible to add the ARController as a delegate to an instance of ARSCNView. This delegate implements the protocol ARSCNViewDelegate, and can receive callbacks when ARKit places, removes, and updates nodes on anchors in the ARKit coordinate system. Furthermore, it is possible to add ARController as a delegate to an ARSession referenced by the ARSCNView. This delegate implements the protocol ARSessionDelegate, and can receive callbacks when ARKit updates the visual frame or adds, removes, or updates the anchors themselves.
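In code, this wiring might look roughly as follows (a sketch; the property names are assumptions, not copied from the project):

// Sketch of registering ARController as delegate for both the view and
// the session (the property names sceneView and arController are assumptions).
sceneView.delegate = arController          // receives ARSCNViewDelegate callbacks
sceneView.session.delegate = arController  // receives ARSessionDelegate callbacks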

These callbacks are used, for instance, when ARKit detects a playable surface or one of the physical cards, in which case it adds SCNNodes on the corresponding anchor points. We can detect the types of these anchors:

func renderer(_ renderer: SCNSceneRenderer, didAdd node: SCNNode, for anchor: ARAnchor) {
    if let imageAnchor = anchor as? ARImageAnchor {
        handleCardDetected(imageAnchor: imageAnchor, node: node)
    } else if let planeAnchor = anchor as? ARPlaneAnchor {
        handlePlaneDetected(planeAnchor: planeAnchor, node: node)
    }
}

In the case of detecting images, we have configured ARKit with the set of images to look for. When ARKit adds an ARImageAnchor, it contains a reference to the concrete image it has detected. We use the name of this image to look up the corresponding card within the app:

private func handleCardDetected(imageAnchor: ARImageAnchor, node: SCNNode) {
    let referenceImage = imageAnchor.referenceImage
    ...
    if let cardIdentifier = Int(referenceImage.name!) {
        if let card = cardMapper?.getCard(identifier: cardIdentifier) {
            cardWorld.addCard(plane: plane, card: card)
        }
    }
}

In conclusion, ARKit is integrated by adding delegates to ARSCNView and ARSession. Once configured, ARKit makes callbacks to our code with information that is easy to handle.

The use of SceneKit is rather trivial, and will not be discussed further.

3.7 AudioKit

This framework is wrapped in its own class called AudioController. The purpose of this class is to hide some of the implementation details of loading and activating sounds with AudioKit. The class is a singleton, because there is only a single instance of AudioKit per application. This design is inspired by the recommendations from the official AudioKit examples. In order to allow other classes in the app to load sound files in a single line of code, the following method was implemented:

public func makeSound(withName name: String) -> AKAudioPlayer? {
    if let soundFile = try? AKAudioFile(readFileName: name) {
        if let player = try? AKAudioPlayer(file: soundFile) {
            player.looping = false
            AudioKit.output = player
            
            return player
        }
    }
    
    print("Error: Could not read sound file with name \(name)")
    return nil
}

After loading a sound file, a client can start AudioKit, which enables playback of the AKAudioPlayer returned from the method above. This class has been placed in the controller layer of the app. This is because sound can be seen as an audible interface to an application, and thus it does not belong in the model layer. One might even argue that sound belongs to the view, but it is nevertheless managed by the controllers.
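A typical client interaction could therefore look like this minimal sketch (the singleton accessor and the file name are assumptions for illustration):

// Hypothetical usage of AudioController; the accessor "instance" and
// the sound file name are assumptions.
if let player = AudioController.instance.makeSound(withName: "pickup.wav") {
    try? AudioKit.start()  // start the AudioKit engine to enable playback
    player.play()
}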

3.8 Design of the primary view controller

In terms of implementation, one view in particular became rather complicated. This was the view used for the augmented reality experience, which must handle things like image detection, user input, and forwarding calls to the active level. Level is a class used to encapsulate the state of the currently active exercise. Placing all of this logic in one class would be asking for trouble, so we ended up with the separation illustrated below.

AR Programming Game Controller
Central design classes of the main game view

3.8.1 Delegation of responsibilities

In the center we have GameViewController. This class is primarily used for forwarding data received from other view controllers during segueing, as well as binding the separate parts of the view together. It delegates most of the responsibilities for the view to ScanViewController and LevelViewController. ScanViewController is rather simple, in that it merely receives callbacks when a new card has been detected, and displays the details of that card on the screen.

LevelViewController has the responsibility of letting the user choose a playing surface, detect sequences of cards, executing the program on the robot, and resetting the level’s state. This requires it to know about a few other classes in the design diagram above.

Everything related to augmented reality is handled by the class ARController. Despite its name, this is not a view controller. It is possible to start and stop AR-related functionality to save battery when the user navigates to other views. The ARController uses the CardMapper protocol implemented by Level to determine the functionality of the cards that it detects (cards vary from level to level). It stores these cards along with their positions in a CardWorld object. When cards or surfaces are detected, it makes callbacks to its delegates represented by the protocols PlaneDetectorDelegate and CardScannerDelegate. This allows for a more decoupled design.

ARController was created because of a limitation in the ARKit and SceneKit frameworks. It is only possible for an ARSCNView object to have one delegate, so splitting the logic of detecting cards and surfaces was not easily possible. We therefore decided to keep it in a single class, and then use delegates for providing the responses to these detections. This is heavily inspired by the way UIKit is designed, where controllers use delegates to determine how they should behave in certain situations.
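As a rough sketch, the two delegate protocols might be declared like this (the method signatures are assumptions based on the description above, not copied from the project):

// Sketch of the delegate protocols used by ARController; the method
// signatures are assumptions.
protocol PlaneDetectorDelegate: AnyObject {
    func planeDetected(withAnchor anchor: ARPlaneAnchor)
}

protocol CardScannerDelegate: AnyObject {
    func cardScanned(card: Card)
}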

3.8.2 Lack of navigation in the game view

When using the application, one might notice that navigating from the main game view to the card scanning view does not make use of a navigation controller, but instead simulates this functionality. This is because navigating to a different view would require setting up a new ARSCNView, which is resource intensive and also hides the level that is visible from the game view. Furthermore, it prevents world tracking by ARKit in the game view, meaning that the level might be out of place when returning from the card scanning view.

For this reason, GameViewController selectively hides and displays the UI elements used by the LevelViewController and ScanViewController respectively when “navigating” between them. This provides the illusion of moving between views, while maintaining the same ARSCNView for both playing the game and scanning cards.
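A minimal sketch of this technique, assuming the two child views live in container views with outlets on GameViewController (the outlet names are assumptions):

// Sketch of simulated navigation by toggling container views
// (the outlet names levelContainer and scanContainer are assumptions).
private func showScanner(_ scanning: Bool) {
    levelContainer.isHidden = scanning
    scanContainer.isHidden = !scanning
}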

3.8.3 Referencing child view controllers

One challenge with separating the game view into a LevelViewController and ScanViewController was getting a reference to these controllers from within GameViewController. However, since these child controllers are placed inside container views, the prepare-for-segue method on GameViewController is called with these child view controllers upon initialization. Thus, getting references to them was this simple:

override func prepare(for segue: UIStoryboardSegue, sender: Any?) {
    super.prepare(for: segue, sender: sender)
    
    //Use this trick to get access to the controllers of the child views
    switch segue.destination {
    case let levelController as LevelViewController:
        self.levelViewController = levelController
    case let scanController as ScanViewController:
        self.scanViewController = scanController
    default:
        break
    }
}

3.9 Persisting levels

In order to support a dynamic selection of levels in the app, we decided to store levels on the user’s device. This will allow us to update levels and add new ones without changing the app itself. First, we needed to choose a location to store the level files. We settled on the application support directory, which according to Apple should be used for:

“any files that are not user data files. […] Use the Library subdirectories for any files you don’t want exposed to the user. Your app should not use these directories for user data files.” [source]

The user has no interest in reading or even editing the level files, so this directory seems most fitting.
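The directory itself can be resolved through FileManager; a sketch under the assumption that the directory should be created on demand (the helper name and file layout are assumptions):

import Foundation

// Sketch: resolving the URL of a level file in the application support
// directory (the helper name and file name convention are assumptions).
func levelFileURL(named fileName: String) -> URL? {
    let fileManager = FileManager.default
    guard let supportDir = fileManager.urls(for: .applicationSupportDirectory,
                                            in: .userDomainMask).first else {
        return nil
    }
    // The directory is not guaranteed to exist, so create it if needed.
    try? fileManager.createDirectory(at: supportDir,
                                     withIntermediateDirectories: true)
    return supportDir.appendingPathComponent(fileName)
}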

We decided to use the Codable protocol for storing Level objects in JSON format. The first step was to make Level implement this protocol, which requires all its attributes to do the same. This was also easily done for TileMap. However, the dictionary of Card objects stored in Level objects caused issues, since the Card struct references an instance of the CardCommand protocol. When storing instances of this protocol, we lose information about the concrete implementation, and thus we cannot load the Card objects back in along with their commands.

We solved this by implementing the codable constructor and the encode method ourselves, along with the CodingKeys enum. Now having control over how Levels are serialized and deserialized, we decided to serialize Card objects using only their names. All other attributes can be saved with no further changes:

func encode(to encoder: Encoder) throws {
    var container = encoder.container(keyedBy: CodingKeys.self)
    
    try container.encode(name, forKey: .name)
    try container.encode(levelNumber, forKey: CodingKeys.number)
    try container.encode(unlocks, forKey: CodingKeys.unlocks)
    
    var encodeCards = [Int:String]()
    for (index, card) in cards {
        encodeCards[index] = card.name
    }
    try container.encode(encodeCards, forKey: .cards)
    
    try container.encode(tiles, forKey: CodingKeys.tiles)
}

When deserializing Card objects, we pass this name to a CardFactory instance that can map between names of cards and their actual object, complete with a correct CardCommand object:

required convenience init(from decoder: Decoder) throws {
    let container = try decoder.container(keyedBy: CodingKeys.self)
    
    let name = try container.decode(String.self, forKey: .name)
    let number = try container.decode(Int.self, forKey: CodingKeys.number)
    let unlocks = try? container.decode(String.self, forKey: CodingKeys.unlocks)
    self.init(name: name, levelNumber: number, unlocks: unlocks)
    
    let decodeCards = try container.decode([Int:String].self, forKey: .cards)
    for (index, cardName) in decodeCards {
        cards[index] = CardFactory.instance.getCard(named: cardName)
    }
    
    self.tiles = try container.decode(TileMap.self, forKey: CodingKeys.tiles)
}
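With both methods in place, a level can be read from disk in a few lines using JSONDecoder (levelFileURL is the hypothetical helper sketched earlier):

// Sketch of loading a Level from a JSON file on disk.
if let url = levelFileURL(named: "level1.json"),
    let data = try? Data(contentsOf: url),
    let level = try? JSONDecoder().decode(Level.self, from: data) {
    print("Loaded level \(level.name)")
}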

Initially, all levels but the first one are locked. The user unlocks levels by completing the most recently unlocked level. In order to store information about what levels the user has unlocked, we decided to use Core Data. This framework does not support predefined data for when the app is installed on the user’s device, which is why it was not used to store level information. But it works well for storing data that gets generated while the app is used, such as unlocking levels.

The code snippet below shows how Core Data was used to determine if a specific level is unlocked. All levels can be identified by a number. If a level with that number is stored with Core Data, it contains an unlocked attribute of the type Bool. If the level could not be found in Core Data, we assume that the level has not been encountered yet, and thus it must be locked. Initially, only the first level is stored in Core Data, and new entries are added as the user completes more and more levels. Because of this design, it is programmatically possible to lock a level that was previously unlocked.

private static func isLevelUnlocked(withNumber levelNumber: Int) -> Bool {
    let request = NSFetchRequest<LevelEntity>(entityName: "LevelEntity")
    request.predicate = NSPredicate(format: "level = %d", levelNumber)
    if let result = try? managedObjectContext.fetch(request) {
        if result.count != 0 {
            return result[0].unlocked
        }
    }
    return false
}
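For completeness, the write side could look roughly like this sketch (assuming LevelEntity is the NSManagedObject subclass generated by Core Data, with a numeric level attribute; the method name and attribute types are assumptions):

// Sketch of unlocking a level: fetch the existing entity or create a
// new one, then save (attribute types are assumptions).
private static func unlockLevel(withNumber levelNumber: Int) {
    let request = NSFetchRequest<LevelEntity>(entityName: "LevelEntity")
    request.predicate = NSPredicate(format: "level = %d", levelNumber)
    
    let entity = (try? managedObjectContext.fetch(request))?.first
        ?? LevelEntity(context: managedObjectContext)
    entity.level = Int32(levelNumber)
    entity.unlocked = true
    try? managedObjectContext.save()
}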

3.10 Implementation evaluation

Classes have been divided into models, views, and controllers. Our models can communicate with each other, but they do not make calls to controllers. They have no properties or methods for UI, and they only implement business logic. We would have liked to better encapsulate the state of the Level model, because in its current state the controller is responsible for storing both the level and the playing scene individually. This puts a lot of responsibility on the controller for resetting levels, something that should be handled by the model alone.

The controllers talk to our models and present their information in the view. Furthermore, in order for a model to inform a controller about updates, we have decided to use delegation. For instance, we have a LevelDelegate protocol with functions for when the level is completed or reset, or when collectibles are taken by the robot. The controller sets itself as the Level’s delegate and updates the view when these methods are invoked, as sketched below.
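A rough sketch of what this protocol might look like (the exact method names are assumptions based on the description above):

// Sketch of the LevelDelegate protocol; the method names are assumptions.
protocol LevelDelegate: AnyObject {
    func levelCompleted(_ level: Level)
    func levelReset(_ level: Level)
    func collectibleTaken(in level: Level)
}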

Our views respond to the users’ interactions by making calls to the controllers: when a user taps a UIButton, an IBAction function is executed. Overall, we think we have used and implemented the MVC design pattern correctly.
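As an illustration of this flow, a tap handler might look like this hypothetical sketch (both names are assumptions):

// Hypothetical IBAction handler: the view reports the tap, and the
// controller forwards the request to the model.
@IBAction func runButtonTapped(_ sender: UIButton) {
    currentLevel.run()  // property and method names assumed for illustration
}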

Exception handling has not been a major focus in this app. We have attempted to prevent the app from crashing, but when an exception occurs, it will generally be ignored. This is because we cannot meaningfully handle many of the exceptions, so we might as well allow the rest of the app to continue working, albeit with slightly less functionality. Such exceptions might stem from missing images or other files. We have not experienced any problems with this during the project.

4 Discussion

The project that was handed in on November 29th successfully implemented scanning of programming cards on a flat surface, programming of a virtual robot based on these cards, and the ability for the user to see the functionality of cards in focus. Furthermore, we implemented playable levels that can be stored on the user’s device, as well as a view for choosing levels to play. Lastly, we implemented a view for accessing the library of available cards along with their images and descriptions.

Within the time allotted to this project, we were unable to implement an in-app menu for printing cards. We also did not implement in-app tutorials or the installation of new card libraries. These latter two points were marked as possibly out of scope for this project and can form the basis for further work.

4.1 Usability

Optimally, the application should have been tested by its target group, which is school children from 1st to 4th grade. Within the scope of this project, however, that was not possible. Instead, the application has been evaluated by other university students for its usability.

It was found that managing the sequence of physical cards while also holding the iPad can be a little tricky. The user must hold the iPad with one hand and move the cards around with the other. When working together, users found it easier to split the work of moving cards and holding the iPad between them. This issue is worth looking into in future work, as it might reduce the learning outcome for the user who does not move the cards around.

Furthermore, having to perform surface detection every time a new level is started was tedious for users. It was not a big problem for the mere three levels we have made available, but with many levels it could become an issue. This is not helped by the fact that surface detection can be difficult on flat, featureless surfaces or in bad lighting conditions.

So overall, the app has some usability issues that will need to be solved before this application can become successful.

4.2 Further development

For further development, we would put more focus on user-centered design by working with users who test our app. Such a user test could be done with school children by giving them tasks to perform using the app. We would also implement more functionality: more levels, tutorials to guide the user, more complex cards such as branching or loops, and more types of robots. Perhaps also multiplayer, where all users can connect to the same game and see the robot on their own iPads. A highscore system could also be implemented, where the score for each level is saved in Core Data. The possibilities are endless.

We would also like to improve the algorithm for detecting the sequence of cards. In its current state, it contains a bug: once a sequence has been detected, the app is very reluctant to accept changes to it, forcing the user to reload the level. This bug has not been fixed because of time constraints.

4.3 Conclusion

The purpose of this project was to go from an idea to a functional app. This was done by going through different phases. First, we brainstormed to find an idea and figure out its requirements. We then made prototypes using online tools. The navigational prototype was used as a basis for implementing a prototype of the GUI in code. Lastly, we built upon the GUI prototype and implemented the most important requirements. Overall, we fulfilled the purpose of the project: going from an idea to a functional app.

5 References

These are not specifically referenced from the blog post, but the sources have provided valuable information for creating this project.

5.1 ARKit

5.2 AudioKit

5.3 Core data

5.4 Miscellaneous
