ATLANTA (WSAV) – Led by Assistant Professor of Physics and Astronomy Sidong Lei, Georgia State University researchers have designed a new type of artificial vision device that features a novel vertical stacking architecture and allows for greater depth of color recognition and scalability on a micro-level. The main purpose of the research is to develop a micro-scale camera for microrobots that can enter narrow spaces that are inaccessible by current means, opening up new horizons in medical diagnosis, environmental study, manufacturing, archaeology, and more.

WSAV’s Hollie Lewis had the opportunity to ask Assistant Professor of Physics and Astronomy Sidong Lei questions about the research.

Sidong Lei, Assistant Professor of Physics and Astronomy at Georgia State University, led the research into designing a new type of artificial vision device that allows for greater depth of color recognition.

What led you and your team to start this design, and how long did it take from start to finish?

The device prototyping and characterization took more than half a year, but we devoted a large amount of effort to finding the proper semiconductor materials and investigating the feasibility. And this idea was originally generated from a discussion with my Ph.D. advisor, Dr. Pulickel Ajayan at Rice University, and he continuously encouraged me to pursue this direction. In that sense, we have prepared for this project for several years.

How serious was the need for the artificial vision device and why?

Compared with simple image capturing, artificial vision takes one more step forward toward automatic information processing and decision making. Let us use medical diagnosis as an example. Currently, doctors use endoscopes for diagnosis and make decisions on the following procedure. In other words, the information is processed by our brains. Although our brains are intelligent, delays and misjudgments are still unavoidable. In contrast, artificial vision devices can interpret the image by themselves and make instantaneous and consistent responses.

What is a novel vertical stacking architecture?

Human eyes have three types of vision cells to sense red, green, and blue light. All other colors, such as orange, yellow, cyan, and many more, are composed of red, green, and blue. Cameras follow the same principle to take color images, thus requiring three color-sensing elements to detect red, green, and blue light, respectively. In conventional image sensors, these elements are typically arranged side-by-side in a chessboard-like pattern on the same plane, which is space-consuming. In our design, instead of using this lateral layout, we stack the red, green, and blue units in the vertical direction and therefore save up to 75% of the sensor volume.
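A rough sketch of where that figure comes from, assuming the common Bayer-style chessboard arrangement in which each full-color pixel occupies a 2 × 2 block of side-by-side elements (one red, two green, one blue): stacking the color units vertically shrinks that footprint from four cells to one, cutting the area to a quarter, which is a saving of up to 1 − 1/4 = 75%.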

Aisha Okmi was a lead author who performed much of the experimental work, including material synthesis, data collection and analysis. She also helped draft the manuscript. Photo provided.

Please share more about the greater depth of color recognition the device has (what colors are recognized, can it recognize colors faster, etc.)?

Image sensors with the vertical RGB structure are currently commercially available, and people call them Foveon sensors. Then the question turns to why we want to develop something that appears similar. This is because our structure can bring many benefits, including a faster speed, as you said.

To understand these benefits, we need to first explain the principle of Foveon sensors, which is indeed ingenious from an electrical engineering standpoint. When a light beam is shone vertically into a silicon slab, the red, green, and blue components are automatically separated along the light propagation direction. This is very similar to the gradient of color varying from orange to blue in the dawn sky, although the physics principles are not exactly the same. This is how a Foveon sensor performs the color-sensing function. But the drawback is also clear: the “RGB” colors seen by the silicon are different from the ones recognized by human eyes. Thus, sophisticated mathematical calculation is needed to correct the error, and this job is typically done by a CPU in our camera or cell phone, consuming time and energy.

In contrast, our design uses three different types of new semiconductor materials, and each of them sees the same red, green, or blue color as our eyes do, so we can omit the sophisticated correction process afterward and deliver more vivid colors.
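To give a concrete sense of what that correction step involves (the step the vertical stacking design is meant to avoid), here is a minimal illustrative sketch, in Python, of the kind of per-pixel fix-up a camera's processor typically performs on Foveon-style readings; the matrix values below are hypothetical placeholders and are not taken from the Georgia State work.

    import numpy as np

    # Hypothetical 3x3 color-correction matrix of the sort a Foveon-style
    # pipeline applies to every pixel; real values are calibrated per sensor.
    correction = np.array([
        [ 1.6, -0.4, -0.2],
        [-0.3,  1.5, -0.2],
        [-0.1, -0.5,  1.6],
    ])

    # "RGB" as separated inside the silicon (illustrative reading for one pixel)
    raw_rgb = np.array([0.42, 0.35, 0.23])

    # Map the silicon's RGB toward the RGB our eyes expect; a full image repeats
    # this multiply for every pixel, which costs the CPU time and energy.
    corrected_rgb = correction @ raw_rgb
    print(corrected_rgb)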

Why is the scalability on a micro-level important?

For image sensors, the “micro-level” refers to the size of each pixel, and scalability means how many pixels we can integrate into each image sensor. By shrinking the individual pixel size, we can fit more pixels into a given space and, in turn, deliver better image resolution. Currently, we have demonstrated the feasibility of constructing smaller pixels, and our long-term goal is to implement large-scale integration of our vertical stacking pixels, for example, 4,000 × 3,000, for high-definition image capturing.

How do color recognition and micro-scalability help in the areas of medical diagnosis, environmental study, manufacturing and/or archaeology?

Ningxin Li gave an oral presentation on the research at the Materials Research Society meeting. He was also a lead author who performed much of the experimental work, including material synthesis, data collection and analysis. He also helped draft the manuscript.

All these areas share some common features: they require precise operations in narrow and complicated spaces. Therefore, high-quality image capturing turns out to be critical. Take medical diagnosis, for example: endoscopy and minimally invasive surgery both need miniature cameras that can fit into the human body. From the colors shown in the images, we can distinguish lesions from normal tissue. Therefore, the more precisely a camera can recognize color, the more precise the diagnosis will be.

However, as mentioned earlier, the current chessboard-like color recognition structure is bulky. Therefore, people have to reduce the pixel count, for example, to 200 × 200 pixels, in order to fit cameras into the human body. We hope our effort can deliver a better system than the state of the art.
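For a rough sense of that gap, a 200 × 200 sensor contains 40,000 pixels, while the 4,000 × 3,000 target mentioned above would contain 12 million, roughly 300 times as many.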

What field of study do you think the microrobots will be the most beneficial in and why?

Similarly, microrobots will find applications in medicine, environmental science, and the other fields listed above. Still taking medicine as an example: current minimally invasive surgery places considerable risks on patients, such as bleeding and infection, because we still need to manually insert and manipulate medical instruments inside the body through incisions.

In contrast, we expect that in the near future microrobots will be able to enter our bodies and automatically perform operations such as sampling, testing, resection, and drug delivery, even on an outpatient basis. This will be particularly attractive for the treatment of cancer, cerebral thrombosis, and other diseases that challenge current medical methods. Beyond that, microrobotics may also find applications in space exploration, considering that we could shrink a Mars rover to millimeter size.

The implementation of all these possibilities depends on the miniaturization of all the functional units in these microrobots, including the CPU, battery, actuators, wireless signal transceiver and, of course, the image sensors, which give these tiny machines their vision.

You have mentioned that the high-quality color sensing and image recognition functions may open new possibilities for the visually impaired to perceive colorful objects in the future. Could that be next on your agenda? How close are we to that actually being a possibility or happening?

Nearly all of our camera systems are designed to mimic human eyes, in aspects such as the lens structure, the color recognition principle, the function of the pupil, and so on. The development of cameras is actually a process of re-understanding our own eyes. We hope that our effort can bring a new approach to helping the visually impaired, leveraging the more compact design of our system. But we should admit there is still a long way to go, particularly in the area of the machine-brain interface, because we need to translate the electrical signals generated by the sensor into neural stimuli that our brain can understand. I am not an expert in this area, but I hope our work can inspire experts in relevant areas and initiate collaborations in this direction.