
SIGMA: An Open-Source Research Platform Bridging Mixed Reality and AI by Microsoft

In recent years, interest has grown in combining mixed reality and artificial intelligence (AI) for research and innovation. To support this work, Microsoft Research has introduced SIGMA, an open-source research platform for building and studying mixed-reality AI applications. This article covers SIGMA’s capabilities, its potential applications, and the benefits it offers researchers and developers.

Understanding SIGMA

SIGMA, short for “Situated Interactive Guidance, Monitoring, and Assistance,” is a platform developed by Microsoft Research. Its primary objective is to give researchers and developers the tools and resources to explore and innovate in mixed reality and AI.

The platform integrates current AI technologies, such as large language and vision models, with mixed reality devices like HoloLens 2. This combination lets SIGMA offer interactive guidance and task assistance to users across a wide range of applications.

The Power of Mixed Reality and AI

By harnessing the power of mixed reality and AI, SIGMA opens up a world of possibilities. One of its key features is the ability to guide users through procedural tasks using HoloLens 2. Whether it’s performing complex medical procedures, assembling intricate machinery, or learning new skills, SIGMA can provide step-by-step instructions and real-time feedback to enhance user performance and efficiency.

Left: A person using SIGMA running on a HoloLens 2 to perform a procedural task. Middle: First-person view showing SIGMA’s task-guidance panel and task-specific holograms. Right: 3D visualization of the system’s scene understanding showing the egocentric camera view, depth map, detected objects, gaze, hand and head pose. (c) 2024 IEEE

Moreover, SIGMA leverages large language models, such as GPT-4, to enable open-ended conversations between users and the system. Users can ask questions or seek guidance on a particular task, and SIGMA uses the language model to provide relevant and informative responses.

To enhance the user experience, SIGMA incorporates vision models, including Detic and SEEM, to locate and highlight task-relevant objects in the user’s field of view. This feature simplifies complex tasks by providing users with visual cues and guidance, thus reducing the potential for error and improving overall task performance.
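The highlighting step described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not SIGMA's actual code: the `Detection` record, the `objects_to_highlight` function, and the `min_score` threshold are all hypothetical, and the detections are hand-written stand-ins for what a vision model such as Detic or SEEM might return.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Detection:
    """Hypothetical detector output: a label, a confidence, and a pixel box."""
    label: str
    score: float
    box: tuple[int, int, int, int]  # x, y, width, height


def objects_to_highlight(detections, relevant_labels, min_score=0.5):
    """Keep the most confident detection for each task-relevant label."""
    wanted = {label.lower() for label in relevant_labels}
    best = {}
    for det in detections:
        key = det.label.lower()
        if key not in wanted or det.score < min_score:
            continue  # not needed for this step, or too uncertain to show
        if key not in best or det.score > best[key].score:
            best[key] = det
    # Most confident first, so the renderer can prioritize if space is tight.
    return sorted(best.values(), key=lambda d: d.score, reverse=True)


# Stand-in detections for one camera frame (made-up values).
detections = [
    Detection("kettle", 0.91, (40, 60, 120, 160)),
    Detection("mug", 0.42, (300, 210, 60, 70)),
    Detection("mug", 0.88, (310, 200, 64, 72)),
    Detection("laptop", 0.95, (500, 80, 240, 180)),
]
highlights = objects_to_highlight(detections, ["Kettle", "mug"])
for det in highlights:
    print(det.label, det.box)
```

Filtering by the current step's object list keeps unrelated items (the laptop here) out of the user's view, which is the point of the visual cues the article describes.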

The Architecture Behind SIGMA

SIGMA uses a client-server architecture. A lightweight client application runs on the HoloLens 2 device and captures multimodal data streams, including RGB video, depth, audio, and head, hand, and gaze tracking. This data is transmitted to a more powerful desktop server, which processes it and sends instructions back to the client for display.
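To make the round trip concrete, here is a minimal in-process sketch of the capture, send, process, and reply loop. It assumes a JSON message format and uses hypothetical names throughout (`SensorFrame`, `GuidanceServer`, `HeadsetClient`); the real system streams far richer data over a network, and none of these names come from the SIGMA codebase.

```python
import json
from dataclasses import asdict, dataclass


@dataclass
class SensorFrame:
    """Hypothetical, heavily simplified stand-in for one client capture."""
    timestamp: float
    speech: str       # recognized utterance, empty if the user said nothing
    gaze_target: str  # label of the object the user is looking at


@dataclass
class GuidanceServer:
    """Plays the desktop server: tracks the current step of a task."""
    steps: list[str]
    current: int = 0

    def handle(self, message: str) -> str:
        frame = SensorFrame(**json.loads(message))
        # Advance when the user says "next", but never past the last step.
        is_next = frame.speech.strip().lower() == "next"
        if is_next and self.current < len(self.steps) - 1:
            self.current += 1
        reply = {
            "timestamp": frame.timestamp,
            "step_index": self.current,
            "instruction": self.steps[self.current],
        }
        return json.dumps(reply)


class HeadsetClient:
    """Plays the headset client: serializes frames, returns the reply."""

    def __init__(self, server: GuidanceServer) -> None:
        self.server = server

    def send(self, frame: SensorFrame) -> dict:
        # A direct call stands in for the network hop in this sketch.
        return json.loads(self.server.handle(json.dumps(asdict(frame))))


server = GuidanceServer(["Fill the kettle", "Boil the water", "Pour into the mug"])
client = HeadsetClient(server)
first = client.send(SensorFrame(0.0, "", "kettle"))
second = client.send(SensorFrame(1.0, "next", "kettle"))
print(first["instruction"], "->", second["instruction"])
```

Keeping the client this thin mirrors the split the article describes: the headset only captures and displays, while the heavier processing lives on the server.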

The underlying architecture of SIGMA is based on the Platform for Situated Intelligence (psi), an open-source framework developed by Microsoft Research. This framework allows for the development and research of multimodal integrative AI systems, providing support for fast prototyping, data-driven development, and visualization.

Potential Applications of SIGMA

SIGMA’s capabilities make it a versatile platform with numerous potential applications. Here are a few examples:

1. Mixed Reality Task Assistance

SIGMA can act as a virtual guide, providing real-time instructions and feedback to users performing complex tasks. This can be particularly useful in domains such as medicine, manufacturing, and construction, where precise and accurate task execution is critical.

2. Training and Education

With its ability to offer step-by-step guidance and interactive conversations, SIGMA can be a valuable tool in training and education settings. From teaching complex procedures to assisting students in their learning journey, the platform has the potential to revolutionize the way we acquire knowledge and skills.

3. Augmented Collaboration

SIGMA can facilitate collaboration between humans and AI systems in a mixed reality environment. By providing real-time feedback, highlighting task-relevant objects, and enabling interactive conversations, the platform enhances the collaborative capabilities of teams working on complex projects.

4. Personalized Assisted Living

In the realm of healthcare, SIGMA can assist individuals with personalized guidance and support. From aiding patients in following treatment plans to helping elderly individuals perform daily tasks, the platform can improve the quality of life for many.

Advantages of an Open-Source Research Platform

By open-sourcing SIGMA, Microsoft Research aims to foster collaboration, accelerate research, and encourage innovation at the intersection of mixed reality and AI. Here are some key advantages of an open-source research platform like SIGMA:

1. Accessibility and Transparency

By making SIGMA freely available to researchers and developers, Microsoft Research ensures that anyone can access and use the platform’s capabilities. This fosters inclusivity and transparency, and encourages community-driven development.

2. Collaboration and Community Building

An open-source platform allows researchers and developers from different backgrounds to collaborate and share their insights, contributing to a vibrant community. This collaborative environment can lead to accelerated progress and breakthroughs in the field.

3. Customization and Extensibility

With access to the platform’s source code, developers can customize and extend SIGMA’s functionalities to suit their specific research or application needs. This flexibility enables researchers to explore novel ideas and push the boundaries of what is possible.

4. Feedback and Improvement

The open-source nature of SIGMA encourages users to provide feedback and contribute to its development. This feedback loop facilitates continuous improvement and ensures that the platform evolves to meet the changing needs of the research community.

In conclusion, SIGMA, the open-source research platform developed by Microsoft Research, brings together mixed reality and AI. By providing interactive guidance, real-time feedback, and open-ended conversations, SIGMA opens up new possibilities for research, innovation, and collaboration. With potential applications across many domains, it could change how people interact with AI systems while carrying out real-world tasks.

All credit for this research goes to the researchers of this project.

Ritvik Vipra

Ritvik is a graduate of IIT Roorkee with significant experience in software engineering and product development, working on core machine learning, deep learning, and data-driven enterprise products that use state-of-the-art NLP and AI.
