• Camera that recognizes shapes at an angle, then builds a virtual 3D model of them: is such stuff possible?
Hey, long story short: I just started learning AI in college and it sure seems like interesting stuff. This is probably still futuristic science, but just in case... I've done 3D modelling for a good few years, I've been doing Java and many other languages for the past two years, and next year I have the final-year project for my bachelor's degree. It's a solo project, and these projects are usually pretty mental (complex).

My idea: build a small robot/robotic arm with a high-res camera (I plan to spend up to 250 Euros on the camera) which would scan objects and place each one in a virtual 3D environment (not visible to the user). Once it finishes scanning and the 3D environment is built, it would run a physics simulation to check whether the action the robot is about to perform would cause a collapse. I know this would need a lot of CPU power, but I'm pretty sure I can come up with something.

I had a few possible cases in mind:

Case 1)
[B]IF:[/B] The stacked objects are all identical.
[B]THEN:[/B] The user pre-configures the robot and specifies the object's dimensions (cube shapes only, nothing else), and an ultrasound emitter measures proximity so that (I assume) the camera can capture the object faster.
[B]END:[/B] The objects are built in the virtual 3D environment and simulated; the robot tests removing an object, and once the removal succeeds in the 3D environment, it removes that object in the real world.

Case 2)
[B]IF:[/B] The stacked objects are all different (but still cubes).
[B]THEN:[/B] The user can mark a specific area on screen for the robot to focus on, so only that area gets modelled and simulated.
[B]END:[/B] Objects are built and removed the same way as above.
[B]ASSUMPTION:[/B] Higher chance of collapse, since it only tests that specific area and doesn't account for stress from the other areas, but the upside (I think) is that it would be much faster.

Case 3)
[B]IF:[/B] The objects' corners are labelled/marked.
[B]THEN:[/B] This could really speed up the modelling, since the robot would know where the corners are.
[B]END:[/B] Same as above.

Obviously all these cases can be mixed/modified.

[B]In short:[/B]
Aim: have a robot/arm with a camera analyse a stack of cubes, virtually model the environment, and run a physics simulation; once removing an object passes in the virtual environment without collapsing the whole stack, remove that object in real life.

[B]Image:[/B] [IMG]http://images.teamsugar.com/files/upl1/10/109609/16_2008/unclegoose.jpg[/IMG]
Basically like that, of course with fewer cubes, around 4 or 5 depending.

So I was wondering: is this possible at all yet, or is there no CPU capable of doing all this? Thank you.

[B]P.S.[/B] The Facepunch B tags seem to be pretty fucked up for me.
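The "test the removal in the virtual environment before doing it for real" step could start much simpler than a full physics simulation. A toy sketch (everything here is made up for illustration: cubes as unit grid positions, and a naive rule that a cube is safe to remove if nothing would be left floating):

```python
# Toy sketch of the virtual removal test. Cubes are unit, axis-aligned,
# identified by integer (x, y, z) grid positions. Not a real physics sim.

def is_supported(cube, cubes):
    """A cube is supported if it rests on the floor (z == 0)
    or sits directly on top of another cube in the stack."""
    x, y, z = cube
    return z == 0 or (x, y, z - 1) in cubes

def removal_is_safe(target, cubes):
    """True if removing `target` leaves every remaining cube supported."""
    remaining = cubes - {target}
    return all(is_supported(c, remaining) for c in remaining)

stack = {(0, 0, 0), (1, 0, 0), (0, 0, 1)}
print(removal_is_safe((1, 0, 0), stack))  # side cube carries nothing -> True
print(removal_is_safe((0, 0, 0), stack))  # cube above would float -> False
```

A real version would hand the scanned cube positions to a physics engine instead of this support rule, but the pass/fail gating works the same way.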
[video=youtube;Oie1ZXWceqM]http://www.youtube.com/watch?v=Oie1ZXWceqM[/video]
[QUOTE=MaxOfS2D;42148522][video=youtube;Oie1ZXWceqM]http://www.youtube.com/watch?v=Oie1ZXWceqM[/video][/QUOTE] I never knew something like this could be done so quickly. It involves human interaction though, but I guess I could make it automatic after I study this stuff for a while. What about CPUs though? I've seen some crazy stuff before, like robots playing football, and it didn't take them an hour to calculate where the ball was rolling, so I assume this is possible then?
[QUOTE=MaxOfS2D;42148522][video=youtube;Oie1ZXWceqM]http://www.youtube.com/watch?v=Oie1ZXWceqM[/video][/QUOTE] Haha, that's completely unrelated to what the OP needs. That thing is great, but it's certainly not suited to real-time applications (i.e. a robot).

I actually studied a bit of robotics and computer vision for a year. I don't remember much of it because I personally didn't find it all that interesting, but we did build a robot that would locate coloured balls and move towards them.

First of all, you should know that you will almost never work with colour images, because they're extremely hard to process in that state. You'll usually work with shitty monochrome images computed from the colour image in such a way that only the relevant information stands out, so all the computer sees is a few white blobs on a completely black background. For instance, if you want to follow a red ball around, you'd take the red channel of the image captured by the camera, possibly subtract the green and blue channels to make sure it's actually red and not yellow or white, and then [url=http://en.wikipedia.org/wiki/Thresholding_(image_processing)]binarize[/url] this image. From there, you'd need to locate the ball in that whole mess of pixels (usually by filtering the image to eliminate the unneeded garbage), and that would tell you roughly where the ball is.

Now, if you're going to work with cubes, you'll probably need clear markings on them so they show up in the binarized image. Then you'd need the arm to take several pictures of the scene from different angles to determine their positions; you can't do that with just one picture, due to the lack of depth perception.

Overall, it's a really tedious process involving a lot of image processing and matrix math. AI and 3D modelling aren't going to help you much.
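The channel-subtraction and binarization step described above fits in a few lines. A sketch in Python/NumPy (the threshold value is made up and would need tuning for a real camera):

```python
import numpy as np

def binarize_red(image, threshold=100):
    """Isolate strongly red pixels: subtract the green and blue channels
    from the red channel so yellow/white areas cancel out, then threshold
    the result into a binary mask."""
    r = image[..., 0].astype(np.int16)
    g = image[..., 1].astype(np.int16)
    b = image[..., 2].astype(np.int16)
    redness = r - (g + b) // 2        # pure red stays high, white/yellow drop
    return (redness > threshold).astype(np.uint8)

# Tiny test image: one pure-red pixel, one white pixel, rest black.
img = np.zeros((4, 4, 3), dtype=np.uint8)
img[1, 1] = (255, 0, 0)      # red -> should survive
img[2, 2] = (255, 255, 255)  # white -> should be rejected
mask = binarize_red(img)
print(mask[1, 1], mask[2, 2])  # 1 0
```

The filtering step that follows (finding the blob in the mask) is what libraries like OpenCV give you out of the box.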
I still have two years to do that project and lots of free time on my hands, so I guess it's possible. As I said, the cubes could have white marked corners and the cubes themselves be dark, so that would work even with a monochrome image.
Shape from shading is an open research question in computer vision. There are things like this, which uses two views to build some idea of shape: [url]http://www-users.cs.york.ac.uk/~erh/rae/garypami.pdf[/url]

Get the Visual Computing book by Nielsen (ISBN 1-58450-427-7); it'll teach you about things like building depth maps, writing edge detection kernels, epipolar geometry (reducing 2D search spaces for point alignment to 1D), and alpha matte extraction (pulling a shape out of the foreground while ignoring the background). Things which are going to be pretty useful.
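An edge-detection kernel of the kind that book covers is small enough to show directly. A toy sketch in Python/NumPy using a Sobel kernel and a deliberately naive convolution (a real pipeline would use a library routine instead):

```python
import numpy as np

def convolve2d(img, kernel):
    """Naive 'valid' 2D sliding-window convolution, just for the demo."""
    kh, kw = kernel.shape
    h, w = img.shape
    out = np.zeros((h - kh + 1, w - kw + 1))
    for y in range(out.shape[0]):
        for x in range(out.shape[1]):
            out[y, x] = np.sum(img[y:y + kh, x:x + kw] * kernel)
    return out

# Sobel kernel for horizontal gradients (responds to vertical edges).
sobel_x = np.array([[-1, 0, 1],
                    [-2, 0, 2],
                    [-1, 0, 1]])

# Synthetic image: dark left half, bright right half -> one vertical edge.
img = np.zeros((5, 6))
img[:, 3:] = 1.0

gx = convolve2d(img, sobel_x)
print(gx[0])  # [0. 4. 4. 0.] -> strong response only around the edge
```

The same kernel transposed gives vertical gradients; combining both magnitudes is the usual first step towards a depth-map or corner-finding pipeline.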
[QUOTE=asciid;42158308]Shape from shading is an open research question in computer vision. There's things like this which uses two views to build some idea of shape. [url]http://www-users.cs.york.ac.uk/~erh/rae/garypami.pdf[/url] Get the Visual Computing Nielsen book (isbn 1-58450-427-7) it'll teach you about things like building depth maps, writing edge detection kernels, epipolar geometry(reducing 2d search spaces for point alignment to 1D) and alpha matte extraction (taking a shape from the foreground and ignoring the background). Things which are going to be pretty useful.[/QUOTE] Thank you, I guess that's a place to start. Also, this is the first time I've searched for a book in my college library and they didn't have it; every other time I've checked for a book, it was there.
[QUOTE=arleitiss;42164369]Thank you, guess that's a place to start. Also I just found out this is the first time I search for book in my college library and it doesn't have it, all other times I was checking for any book, it was present.[/QUOTE] This might also be relevant to what you're trying to do: [url]http://www.gamasutra.com/blogs/MatthewKlingensmith/20130907/199787/Overview_of_Motion_Planning.php[/url]
Cases like the example image would be rather simple. I mean uniform cube sizes, colored cube edges, and a plain background. You can further improve it by keeping the room very well lit, and minimizing the shadows. You can extract the edges out of the individual channels. You'd then find the edge lines from the binary images quite easily with Hough transform or something. Combining the lines into faces and the faces into cubes would take some work but it's doable. When you have knowledge of the cube's dimensions, you can pretty accurately guess the positions of its other vertices based on just one face. My suggestion is to get a computer vision library (like OpenCV) and just toy around with it.