Emmet - Brixx

In our showcase we demonstrate the power of Artificial Neural Networks through the specialized recognition and classification of interlocking toy bricks. This project serves as a window into the wide range of possibilities offered by current AI technology, and illustrates just how deeply and precisely machines can “see” today.

Practical Application

Brick recognition is a good example of what AI can do when it comes to detecting subtle differences and details. In addition, a successful classification could be of great use in practical applications, such as automating sorting and storage processes or supporting assembly instructions.

System Requirements

The system must be able to accurately identify interlocking toy bricks in photographs, regardless of their position, orientation, or lighting conditions.

Classification of the detected blocks

After recognition, the building blocks are to be classified according to their specific shape and size. The system must be able to distinguish between different types of building blocks.

Color determination

In addition to classifying the shape of the brick, the system must be able to determine the specific color of each detected brick. This includes the ability to detect subtle nuances and gradations.
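As a rough illustration of color determination, the sketch below assigns a measured RGB value to the nearest entry in a reference palette. This is an assumption made for illustration only: the text does not say how the system determines color, and both the palette values and the function name `nearest_color` are hypothetical.

```python
import math

# Hypothetical reference palette (name -> RGB). The values are illustrative
# only; the actual color list used by the system is not given in the text.
PALETTE = {
    "red": (201, 26, 9),
    "blue": (0, 85, 191),
    "yellow": (242, 205, 55),
    "black": (5, 19, 29),
    "white": (255, 255, 255),
}


def nearest_color(rgb, palette=PALETTE):
    """Return the palette name whose RGB value is closest to `rgb`.

    Closeness is plain Euclidean distance in RGB space, which is a
    simplification; a perceptual color space would separate subtle
    shades more reliably.
    """
    return min(palette, key=lambda name: math.dist(rgb, palette[name]))


if __name__ == "__main__":
    print(nearest_color((200, 30, 20)))
    print(nearest_color((250, 250, 250)))
```

A nearest-neighbour lookup like this is sensitive to lighting, so in practice the input would likely be an average color taken from the detected brick region after some form of normalization.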

Linking to building block sets

After classification and color determination, the system should be able to link the identified bricks to the sets they belong to. This mapping capability facilitates inventory management and can help users identify missing or specific pieces in their collections.
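The mapping from identified bricks to sets can be sketched as a lookup against per-set inventories. The inventories, set identifiers, and function names below are hypothetical and only illustrate the idea; the text does not describe the actual data model.

```python
from collections import Counter

# Hypothetical set inventories: set id -> Counter of (shape, color) -> quantity.
SET_INVENTORIES = {
    "set-A": Counter({("brick_2x4", "red"): 4, ("plate_1x2", "blue"): 2}),
    "set-B": Counter({("brick_2x4", "red"): 1, ("tile_1x1", "white"): 6}),
}


def sets_containing(part, inventories=SET_INVENTORIES):
    """Return the ids of all sets whose inventory lists `part`."""
    return sorted(s for s, inv in inventories.items() if part in inv)


def missing_parts(set_id, detected, inventories=SET_INVENTORIES):
    """Return the parts of `set_id` not covered by the detected bricks.

    `detected` is a list of (shape, color) tuples, one per detected brick.
    Counter subtraction keeps only positive remainders, so surplus bricks
    are ignored.
    """
    return inventories[set_id] - Counter(detected)


if __name__ == "__main__":
    found = [("brick_2x4", "red")] * 3
    print(sets_containing(("brick_2x4", "red")))
    print(missing_parts("set-A", found))
```

Using a multiset for each inventory makes the "which pieces are missing" question a single subtraction.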

Data set

Our project is built on data that we created ourselves. Using automated procedures, we have compiled a comprehensive dataset of over 650,000 real photos of interlocking toy bricks, covering a total of 600 different classes of bricks. The photos were taken under different conditions and from different perspectives to make the dataset as broad as possible. In addition to these real photos, we developed a pipeline that allows us to generate synthetic datasets.

Training

All training steps and data processing are performed in-house. This guarantees that all data remains under our control and is never forwarded to third parties. For the training of our models we rely on our in-house GPU cluster.

Results

Our model achieves an impressive accuracy of 97.5% on our test data set. Accuracy is the ratio of correct predictions to the total number of predictions.

The loss value, i.e. the value of the cost function, indicates how far the model’s predictions are from the actual results. In our case, the loss is 0.0914.
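The text does not say which cost function is used. As an illustration only, the sketch below assumes categorical cross-entropy, a common choice for classification networks; the function name and the sample inputs are hypothetical.

```python
import math


def cross_entropy(probabilities, true_labels):
    """Mean negative log-likelihood of the true class over all samples.

    `probabilities` is a list of per-sample class probability lists and
    `true_labels` holds the index of the correct class for each sample.
    Assumes the probability of every true class is greater than zero.
    """
    losses = [-math.log(p[y]) for p, y in zip(probabilities, true_labels)]
    return sum(losses) / len(losses)


if __name__ == "__main__":
    # Two samples, three classes; the model is fairly confident in both cases.
    predictions = [[0.9, 0.05, 0.05], [0.1, 0.8, 0.1]]
    print(cross_entropy(predictions, [0, 1]))
```

Under this assumption, a lower value means the model assigns more probability to the correct class, and a perfect prediction yields a loss of zero.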

Precision indicates the ratio of correctly positively predicted elements to the total number of elements predicted as positive.

Recall measures how many of the actual positive elements were recognized as such by the model.

The F1 score is the harmonic mean of precision and recall and summarizes the quality of the model in a single value. In our case, it is 97.04%.
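The metrics described in this section can be computed from the counts of true and false positives and negatives. The sketch below shows the standard formulas for a single class, including the harmonic mean of precision and recall (commonly called the F1 score). The `Counts` container and the example numbers are hypothetical, and how the per-class values are averaged over all brick classes is not stated in the text.

```python
from dataclasses import dataclass


@dataclass
class Counts:
    tp: int  # true positives
    fp: int  # false positives
    fn: int  # false negatives
    tn: int  # true negatives


def accuracy(c):
    """Correct predictions divided by all predictions."""
    return (c.tp + c.tn) / (c.tp + c.fp + c.fn + c.tn)


def precision(c):
    """Correct positive predictions divided by all positive predictions."""
    predicted_positive = c.tp + c.fp
    return c.tp / predicted_positive if predicted_positive else 0.0


def recall(c):
    """Correct positive predictions divided by all actual positives."""
    actual_positive = c.tp + c.fn
    return c.tp / actual_positive if actual_positive else 0.0


def f1_score(c):
    """Harmonic mean of precision and recall."""
    p, r = precision(c), recall(c)
    return 2 * p * r / (p + r) if (p + r) else 0.0


if __name__ == "__main__":
    example = Counts(tp=8, fp=2, fn=2, tn=88)  # made-up counts
    print(accuracy(example), precision(example), recall(example), f1_score(example))
```

The zero checks only guard against division by zero for classes that never occur or are never predicted.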

Optimization

Through targeted adjustments and optimizations in our system architecture as well as the algorithms used, we were able to achieve considerable improvements in the processing speed. Originally, our server-side processing required a total duration of 30,000 milliseconds (30 seconds) per image. With the improvements we were able to drastically reduce this time to only 150 milliseconds.

For our tests, we used images that each contained between 50 and 100 interlocking toy bricks.
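The arithmetic behind these figures is shown below. The per-brick values are simple averages derived from the numbers quoted above and assume, for illustration, that processing time is spread evenly across the bricks in an image; the variable and function names are hypothetical.

```python
BEFORE_MS = 30_000  # original server-side processing time per image
AFTER_MS = 150      # processing time per image after optimization

# Speed-up factor achieved by the optimization.
speedup = BEFORE_MS / AFTER_MS


def per_brick_ms(image_ms, bricks_in_image):
    """Average processing time per brick for one image."""
    return image_ms / bricks_in_image


if __name__ == "__main__":
    print(f"speed-up: {speedup:.0f}x")
    for bricks in (50, 100):
        print(f"{bricks} bricks: {per_brick_ms(AFTER_MS, bricks):.1f} ms per brick")
```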

Integration of building instructions

During the course of our project, we made a significant advancement: the integration of building instruction processing. This allows us not only to extract valuable additional information, but also to assign each recognized brick to a specific page and its position on that page.

Future work

One of our primary goals for further development is to optimize the processing speed of our classification network. In its current state, our system requires 47 ms to process an image, which corresponds to a rate of about 21 images per second. To enable seamless real-time interaction, we are aiming for a processing rate of 60 frames per second. Such a level of speed would not only significantly improve the user experience, but also open the door to new application areas.

Our ambitious plans rely primarily on our deep expertise in AI. We intend to commercialize this know-how by releasing an app that will be available for all popular mobile operating systems, also developed in-house. In this way we ensure not only the quality of the application, but also a high standard of data privacy.
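The frame-rate figures quoted in this section translate into a per-image time budget as follows. The sketch assumes images are processed strictly one after another, without batching or pipelining; the function names are hypothetical.

```python
def frames_per_second(latency_ms):
    """Throughput when each image takes `latency_ms` and images run sequentially."""
    return 1000.0 / latency_ms


def latency_budget_ms(target_fps):
    """Maximum time per image that still sustains `target_fps`."""
    return 1000.0 / target_fps


CURRENT_LATENCY_MS = 47
TARGET_FPS = 60

current_fps = frames_per_second(CURRENT_LATENCY_MS)  # roughly 21 images per second
budget_ms = latency_budget_ms(TARGET_FPS)            # roughly 16.7 ms per image
required_speedup = CURRENT_LATENCY_MS / budget_ms    # roughly 2.8x faster

if __name__ == "__main__":
    print(f"current: {current_fps:.1f} fps")
    print(f"budget at {TARGET_FPS} fps: {budget_ms:.1f} ms")
    print(f"required speed-up: {required_speedup:.2f}x")
```

Under these assumptions, reaching 60 frames per second means cutting the per-image time from 47 ms to under 17 ms.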

Minifigures...

This project is an extension of our previous work on interlocking toy brick classification and shifts the focus to minifigure recognition and classification. Using modern AI technologies, the project aims to develop a reliable and efficient pipeline capable of accurately classifying minifigures. A particular feature of this project is the ability to identify minifigures that can be assembled in different ways. The main goal of the project is therefore not only to identify such diversely assembled minifigures, but also to design the system so that it can identify the correct construction method and the respective building blocks.

System requirements

Recognition of minifigures

The system should be able to recognize minifigures in a variety of scenarios and lighting conditions.

Classification

Special attention is given to the classification of the torso of the minifigures. This is important for the accurate identification and sorting of the figures.

Finding the associated components

The system must not only identify the minifigure as a whole, but must also be able to assign the specific parts such as the head, arms, and legs that belong to the figure.

Linking minifigures to sets

Another requirement is the ability to automatically link identified minifigures to their associated sets. This could be done through a database that contains information about the affiliation of minifigures to specific sets.
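A database lookup of this kind could look like the following sketch, which uses an in-memory SQLite database. The table name, column names, and identifiers are hypothetical; the text does not specify the schema or the database system.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory database with a hypothetical minifigure-to-set table."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE minifig_sets (
            minifig_id TEXT NOT NULL,
            set_id     TEXT NOT NULL,
            PRIMARY KEY (minifig_id, set_id)
        );
        INSERT INTO minifig_sets VALUES
            ('fig-001', 'set-A'),
            ('fig-001', 'set-B'),
            ('fig-002', 'set-B');
        """
    )
    return conn


def sets_for_minifig(conn, minifig_id):
    """Return the ids of all sets that contain the given minifigure."""
    rows = conn.execute(
        "SELECT set_id FROM minifig_sets WHERE minifig_id = ? ORDER BY set_id",
        (minifig_id,),
    )
    return [row[0] for row in rows]


if __name__ == "__main__":
    db = build_demo_db()
    print(sets_for_minifig(db, "fig-001"))
```

A many-to-many table is used here because one minifigure can appear in several sets and one set can contain several minifigures.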

Practical application

Automatic recognition and classification of minifigures is a challenging task whose complexity far exceeds that of interlocking toy brick classification. Minifigures are composed of a large number of different components and can be combined in numerous ways. With over 15,000 different minifigures (source: bricklink.com) and a multitude of possible assemblies, the number of combinations to be classified grows exponentially.
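The growth in combinations can be illustrated with a short calculation: if every part slot of a minifigure can be filled independently, the number of possible figures is the product of the variants per slot. The slot names and counts below are made up for illustration and are not taken from the project's data.

```python
import math

# Hypothetical number of distinct variants per part slot.
VARIANTS = {"head": 50, "torso": 100, "legs": 40, "headgear": 30}


def combination_count(variants):
    """Number of distinct figures if every slot can be combined freely."""
    return math.prod(variants.values())


if __name__ == "__main__":
    print(f"{combination_count(VARIANTS):,} possible combinations")
```

Each additional part slot multiplies the total, which is why the combination space grows so quickly even with modest counts per slot.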

Data set

Our dataset includes a diverse collection of 8,000 different minifigures that come from a variety of sources. These include photos from collectors as well as images from publicly available databases and websites. It is remarkable that the dataset originally consisted of only 100 minifigures and grew to its current size within half a year. This broad database allows us to address a diverse range of classification problems while achieving robust and reliable classification performance.

Training

Our powerful infrastructure allows us to efficiently train even sophisticated deep learning networks internally. The combination of high quality data and our state-of-the-art hardware ensures that we can develop accurate and reliable models.

Results

Mean Average Precision (mAP) is a metric for evaluating an object detection model. It builds on the following statistics: confusion matrix, Intersection over Union (IoU), recall, and precision.
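Of the statistics listed above, Intersection over Union is the one that decides whether a predicted bounding box counts as a match for a ground-truth box. A minimal sketch, assuming axis-aligned boxes given as `(x1, y1, x2, y2)` corner coordinates (the box format is an assumption, not taken from the text):

```python
def iou(box_a, box_b):
    """Intersection over Union of two axis-aligned boxes (x1, y1, x2, y2)."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b

    # Width and height of the overlap rectangle; zero if the boxes do not overlap.
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    intersection = inter_w * inter_h

    area_a = (ax2 - ax1) * (ay2 - ay1)
    area_b = (bx2 - bx1) * (by2 - by1)
    union = area_a + area_b - intersection

    return intersection / union if union > 0 else 0.0


if __name__ == "__main__":
    print(iou((0, 0, 2, 2), (1, 1, 3, 3)))
```

A detection is typically counted as a true positive when its IoU with a ground-truth box exceeds a chosen threshold; the threshold used in this project is not stated.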

Recognition accuracy is defined as the sum of true positive and true negative test results divided by the total number of test results. Our model achieves 99.2% correct predictions.

The current data set allows the classification of 8,000 different minifigures.

The average processing time for classifying a minifigure is 800 milliseconds. It should be noted that no specific optimization attempts have been made to reduce this processing time. This leaves room for potential improvements in future development phases. The processing is done server-side, which provides the opportunity for continuous updates and improvements.

Future work

Although our results to date are promising, we recognize that there is always room for improvement and further developments. In the upcoming phases of the project, we plan to take the following steps:

We are currently developing an in-house app that will allow users to conveniently use our AI-driven minifigure recognition system via mobile devices. The app development is done exclusively in-house to ensure both control over the project and security of user data.

EMMET SOFTWARE LABS

Emmet Software Labs GmbH & Co. KG
Hertzstr. 6
32052 Herford

Phone: +49 5221-763 999-0         

Email: info@emmet-software-labs.com