Freelance VFX Generalist & Teacher
My research project was to investigate how the current photogrammetry workflow could be improved to make it quicker and cheaper without compromising on quality. The current workflow utilises a technique called focus stacking, where multiple images are captured of the subject with the focal point adjusted in each. These images are then composited together to create one image in which the whole subject is in focus. Focus stacking is a slow process and requires additional software to stack the images. Instead of focus stacking, I planned to shoot the images at a high F-stop: by doing this you increase the depth of field and reduce blurring in the photo, thereby removing the need for focus stacking. A workflow without focus stacking (the single image workflow) should be considerably quicker than one utilising it. My plan, however, was to investigate whether focus stacking is crucial to creating highly detailed, accurate 3D scans. The outcome of this research project is to identify which workflow produces the best results, and in what ways both workflows could be improved in speed, cost and quality.
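The focus-stacking principle can be sketched in a few lines: for each pixel, keep the value from whichever source image is locally sharpest at that point. This is a toy illustration using plain Python lists as greyscale images and a crude neighbour-contrast sharpness measure; real stackers use far more robust focus metrics and alignment, so treat this purely as a sketch of the idea.

```python
# Toy focus stack: greyscale "images" are lists of rows of pixel values.

def local_contrast(img, x, y):
    """Crude sharpness measure: absolute difference to the horizontal neighbours."""
    left = img[y][x - 1] if x > 0 else img[y][x]
    right = img[y][x + 1] if x < len(img[y]) - 1 else img[y][x]
    return abs(img[y][x] - left) + abs(img[y][x] - right)

def focus_stack(images):
    """For every pixel, keep the value from the locally sharpest source image."""
    height, width = len(images[0]), len(images[0][0])
    result = [[0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            best = max(images, key=lambda img: local_contrast(img, x, y))
            result[y][x] = best[y][x]
    return result

# Two 1x3 test images: `sharp` has high local contrast, `blurred` does not.
sharp = [[0, 255, 0]]
blurred = [[100, 120, 100]]
print(focus_stack([sharp, blurred]))  # → [[0, 255, 0]]
```

Because the sharp image wins every pixel, the composite reproduces it exactly; with real photographs, different regions of the composite would come from different frames in the stack.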
Photogrammetry is the process of generating 3D models from multiple photographs. For example (below), this is the statue of Arthur Lowe (Dad's Army) in Thurmaston.
The photogrammetry process starts with capturing the subject you intend to convert into a 3D model. This can be done by taking tens to hundreds of photos of the subject, each image taken from a slightly different perspective. Whilst there are no hard and fast rules on the number of images, generally the more the better. With more images, there is a greater chance of overlap between them, which helps to improve the accuracy when it comes to determining the position of each image. For this example, I took some video walking around the statue and then, with video editing software, converted the clip into individual images, which resulted in approximately 600 images with plenty of overlap. (Below is the source footage).
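The video-to-stills step comes down to a simple sampling calculation: given the clip's frame rate and length, pick evenly spaced frame indices to extract. The numbers below are illustrative, not the actual values from the statue footage.

```python
# Sketch: thin a video clip down to a photogrammetry image set by
# choosing evenly spaced frame indices across the whole clip.

def frame_indices(duration_s, fps, target_images):
    """Return evenly spaced frame indices covering the whole clip."""
    total_frames = int(duration_s * fps)
    step = max(1, total_frames // target_images)
    return list(range(0, total_frames, step))

# A hypothetical 60 s clip at 30 fps thinned to roughly 600 stills:
indices = frame_indices(60, 30, 600)
print(len(indices), indices[:3])  # → 600 [0, 3, 6]
```

The indices could then be fed to any frame-extraction tool; sampling evenly keeps the perspective change between consecutive images consistent, which is what the alignment step relies on.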
The images are then taken into the photogrammetry software, of which there are many options. For example, the open source software Meshroom is available to download and use for free. Another, which I used in this example, is Metashape by Agisoft. (Right) You can see some of the different images imported into the software. The green tick means that the image was accurately positioned in 3D space. With multiple reference points, the software is then able to plot out the surface of the subject based on which parts are visible in each image.
(Below) This is a screenshot from Metashape. Here you can see how each image has been placed in 3D space. In the center is the pointcloud of the statue. A pointcloud is an object made of small points, each of which holds position and colour information. This example was made with a medium pointcloud quality setting, which resulted in just over 57,000 individual points. The scan's detail in geometry and colour could be dramatically improved by increasing the quality setting, however this would take longer to calculate and would result in millions of points which could be slow to view. The pointcloud can either be kept as-is, or converted into a watertight 3D model which could be used as a digital asset in games or VFX, or even 3D printed.
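To make the pointcloud idea concrete, here is a minimal writer for the ASCII PLY format, a standard pointcloud format that tools such as Metashape and Meshroom can export: each point is just a position (XYZ) plus a colour (RGB).

```python
# Sketch of an ASCII PLY serialiser: one header describing the point
# attributes, then one line per point with XYZ position and RGB colour.

def write_ply(points):
    """Serialise (x, y, z, r, g, b) tuples into an ASCII PLY string."""
    header = [
        "ply",
        "format ascii 1.0",
        f"element vertex {len(points)}",
        "property float x",
        "property float y",
        "property float z",
        "property uchar red",
        "property uchar green",
        "property uchar blue",
        "end_header",
    ]
    body = [f"{x} {y} {z} {r} {g} {b}" for x, y, z, r, g, b in points]
    return "\n".join(header + body)

# Two example points: one red, one grey.
cloud = [(0.0, 0.0, 0.0, 255, 0, 0), (0.1, 0.2, 0.3, 128, 128, 128)]
print(write_ply(cloud).splitlines()[2])  # → element vertex 2
```

A real scan would have tens of thousands to millions of such lines, which is why high quality settings produce files that are slow to view.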
(Below) This is the point cloud result from Metashape of the statue. From this you can see how it has accurately reconstructed the statue along with the bench it is sat on and the themed brickwork.
I decided to use insects as my photogrammetry subject as I am thoroughly intrigued by the variety and diversity of insects. I am also interested in the sci-fi film genre, and sci-fi creatures are often heavily influenced by insects. For example, in the film Starship Troopers (1997) it is easy to see how the warrior bugs are grounded in reality: the jaws are inspired by rhinoceros beetles and the colour scheme is similar to a wasp or hornet. Preserved insects look almost identical to how they appear when alive, so insect specimens are a great subject for photogrammetry. Not only could the insects be used as design reference for VFX and games, but they could also serve as digital replacements for physical insect specimens.
To begin, I first needed to see how the photogrammetry software works and to identify what type of hardware I would need. Below are several scans I captured while testing the software.
This was my very first test with the photogrammetry software. The setup was my Sony A6000 with a 55mm-210mm telephoto lens and a Raynox 1.5x macro converter. The scene was lit by several LED desk lights from Ikea. I taped paper over the LEDs to help diffuse the light and avoid sharp highlights, which would cause a problem with camera alignment. The subject was a dead fly I had found in the greenhouse; luckily it had been well preserved, as it had not been destroyed by mites. I used a dressing pin, carefully impaled through the top of the thorax and out of the bottom, and taped this to a helping hand to lift the fly away from the table.
An early attempt revealed that, due to the small size of the fly and the quality of the images, the software had struggled to align the cameras. I then thought to place a paper sheet underneath the fly with unique markers drawn on it. Just as tracking markers are used in visual effects to aid a camera track or motion capture, I believed the same principle could be applied here. The paper marker and needle could be removed once the multiple images were correctly aligned, but I decided to keep them in for this example to show how the setup looked.
I then acquired some insect specimens from a local insect show. This insect is a reindeer beetle, and I applied the same principles to this capture as with the fly. However, I learnt that by keeping the same 55mm-210mm telephoto lens, I was capturing images far too close to the specimen. This resulted in errors such as areas not being captured, or the front legs being misaligned, leading to two legs merging into one.
Another initial idea for my project was to assess the quality of scans captured from a commercially used drone against a hobbyist racing drone. This scan was generated from GoPro footage mounted on my racing drone, in which I orbited myself standing on top of some rocks on a pebbly beach. The ground I am standing on has been captured correctly and looks quite detailed. However, I wasn't captured so well, potentially as a result of flying too close to myself, or there not being enough matching images of me for the software to reconstruct. That said, you can see some resemblance of myself: for example, the blue shirt I was wearing, pinkish pixels where my arms and head are, as well as some white pixels on my head where the FPV goggles used to control the drone were.
(Above) This is the video I used for the photogrammetry of myself and the rocks.
(Above) Screenshot of the cameras and pointcloud aligned and generated in Metashape (formerly Photoscan).
This is a capture of my cat. I already had loads of photos of him, and luckily I had managed to capture around 30 photos of him sleeping in this position. I have always been fascinated by the numerous contortions and deformations a cat will happily snooze in, and I have often found a photo alone cannot quite capture that moment. I had my doubts about this scan: not only does the cat subtly move in-between images, such as his face twitching, but I also presumed fur would not be captured well due to the way light scatters off hair. With that said, the software has managed wonderfully with the small and limited data set. Pushing the software to its limits with tasks like this has given me the confidence that it will not be a major limiting factor in the quality of my insect scans. As with any technology, I believe that in a few years' time not only will 3D selfies (physical and digital) become the norm, but they will also extend to our pets.
(Above) These are the photos used in the photogrammetry software.
Another idea I had for my project was to use the photogrammetry software to capture environments. After a brief visit to the natural history museum at Wollaton Hall, I walked through the grounds of the industrial museum. Using just my phone, I captured around 40 photos of the courtyard. When it came to reconstruction in the software, again the only problems were areas where there was simply not enough data for it to align and reconstruct. The worst part was the very short roof and pillars to the left. However, this could be fixed by placing basic geometry, such as a plane for the roof and cubes for the pillars, and projecting the images onto those objects. Below is a render with a wartime influence using the captured model as the environment.
(Above) This is the render I put together to assess how well these models can be used in other 3D packages.
(Above) This is a video demonstrating the model and where the cameras have been aligned in the scene.
At this point, I was now ready to begin preparing all the hardware I would need for the two workflows. Most of the hardware was easily sourced except for the platform for the specimens. This had to be custom built for this project and below follows how it was made.
I realised that I would need a platform that I could place the specimens on and rotate freely. Initially I had thought to use an old record player, however this was difficult to procure, at least at an affordable rate. That is when I decided to use a lazy susan, which was cheap enough to buy and provided the rotatable platform I needed. However, the issue with the lazy susan is that the turning of the plate isn't smooth, and it would be incredibly difficult to rotate the plate evenly and accurately. Therefore I saw I would need to drive the plate with a motor. After watching several videos of various methods on YouTube, I decided on my own method: a stepper motor controlled by an Arduino.
Below is the component list of the parts I needed to build the table.
I started making the motorised platform by disassembling the lazy susan. I knew I wanted to mount the motor to the base of the bottom plate, so I started by measuring the dimensions of the motor's screw holes and marked the positions on the plate. Then I drilled holes into the wood where the motor would be screwed onto the bottom plate and mounted the motor to it. Luckily there is a hole in the center of the base plate where the motor shaft could poke through. I placed some nuts between the motor and table in order to keep the tip of the motor shaft level with the top of the base plate. Then, with the top plate, I drilled into the bottom so I could glue a piece of aquarium pipe into place. This pipe grips onto the motor shaft, enabling the plate to turn without causing any damage to the motor if the load on the table is too heavy for the bearing to spin. Then I began connecting the motor to the driver, and then to the Arduino with some basic code to make it turn. Once I was confident that the table would turn correctly, I reworked the Arduino code to allow multiple operating modes with buttons.
As I began to test the motor with the table on for the first time, I saw the platform would rotate for a few seconds and then stop. This was quite frustrating to deal with as it became a major block in getting the table finished. What I realised was that the original top plate I was using was far too wide and heavy for the motor to rotate directly. Luckily I had bought two lazy susans, and I decided that instead of using the large top, I would use the small bottom piece from the second table. After I replaced the top, the table began turning the top plate smoothly and consistently. This worked in my favour as it reduced the form factor and weight of the table, making it easier to move and operate.
As I began capturing my first scans, I realised that the rotating table wasn't completely in sync with the camera timelapse. This was because I had not accounted for the delay that occurred from the table turning. For example, if an image is captured every three seconds, then I would want the table to rotate to a new position at that same rate, creating a three second cycle. However, as it takes the table half a second to rotate, the table was operating at three and a half seconds per cycle. This resulted in the table often moving during an image capture due to the timing mismatch. To fix this, all I needed to do was alter the delay the table has between rotations to account for the time it takes to rotate. Another issue I faced was that the motor and stepper driver were making a high pitched whining noise during operation. Whilst this did not affect the performance of the table, it is quite annoying to be around in close proximity. After doing some online research to rule out the power supply and faulty components, I saw that the stepper motor driver had a current limiter. All I needed to do was adjust the current limiter to avoid sending too high a current to the motor.
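The timing fix boils down to one line of arithmetic: the pause between rotations must be the capture interval minus the time the rotation itself takes, otherwise each cycle drifts and the table moves mid-exposure. A small sketch:

```python
# Sketch of the corrected cycle timing: pause = capture interval minus
# the rotation time, so each full cycle matches the camera's timelapse.

def rotation_delay(capture_interval_s, rotation_time_s):
    """Delay between rotations so one full cycle matches the camera interval."""
    if rotation_time_s >= capture_interval_s:
        raise ValueError("table cannot keep up with the camera interval")
    return capture_interval_s - rotation_time_s

# A 3 s capture interval with a 0.5 s rotation needs a 2.5 s pause:
print(rotation_delay(3.0, 0.5))  # → 2.5
```

In the Arduino sketch itself this is simply the value passed to the delay between step bursts.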
(Left) This is the final motorised lazy susan. The bottom of both plates has a rubber plastic ring which fits into a milled groove. This comes from Ikea like this, and the plastic ring helps grip the susan to your surface where the bare wood would slide. I cut out a piece of white foam to cover the top of the table. This foam is secured to the plastic ring, which gives me the option to remove it or replace it with a different cover.
(Right) This is the underneath of the table. You can see the stepper motor used and how it is mounted directly to the base of the table. The cables from the motor are protected by a sleeve and held onto the table. This provides strain relief in case the wire is accidentally pulled, to avoid ripping the wires out of the motor. As the motor is quite tall, I needed to raise the table to avoid having the motor in direct contact with the ground. To do this I used some tall, sturdy rubber feet, screwed into the base like the motor. These feet are tall enough that there is an air gap to keep the motor cool, and being made of rubber they provide plenty of grip on whichever surface the table is placed.
(Left) This image illustrates the two halves of the lazy susan. On the left is the base plate where the motor and bearing are secured. The bearing comes with the lazy susan and allows the top plate to spin freely whilst reducing friction between the two plates. The bearing is also quite wide, which allows it to evenly distribute the weight placed onto the top plate. On the right is the top half, upside down. In the center of the plate is some thick aquarium tubing. This tubing joins the top plate onto the motor shaft. The tubing is thick and narrow enough that it can tightly grip the shaft, meaning no glue is needed to hold them together, and yet the plates can be separated so the bearing can be cleaned and regreased if needed.
(Right) This is the control box into which the table and power supply plug. Inside the box is the Sparkfun RedBoard Arduino and the stepper motor driver, along with a soldered PCB and a strain relief for the power supply, as I had found with a previous Arduino that the power connector could easily snap off. The table can operate without access to the Arduino. This is achieved by two different buttons. The green button is a momentary switch, which allows me to turn the table one step at a time, meaning the table rotates only while I press the button. The red button is a latching switch which latches in when pressed. With this I am able to keep the table rotating by itself, as while the button is latched it continually runs the script that rotates the table. The time between each rotation is set to a constant rate of turning 9 degrees every two seconds.
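The auto-rotate numbers work out neatly: at 9 degrees per step, a full revolution gives a fixed number of camera positions and a predictable total capture time. A quick sanity-check sketch:

```python
# Sketch: derive positions per revolution and total revolution time
# from the table's fixed step angle and step interval.

def revolution_stats(step_degrees, seconds_per_step):
    """Return (camera positions per revolution, seconds per revolution)."""
    positions = 360 // step_degrees
    return positions, positions * seconds_per_step

positions, seconds = revolution_stats(9, 2)
print(positions, seconds)  # → 40 80
```

So each full orbit of the specimen yields 40 evenly spaced camera positions and takes 80 seconds, which the camera's timelapse interval can be matched against.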
(Above) This is an image of the image capturing setup for both workflows. I used a Sony A6000 set to capture RAW and JPEGs, along with a SEL 30mm macro lens. Mounted onto the camera was an LED ring flash, and on top was an LED panel. To both sides of the specimen I placed two LED panels; during image capturing these were placed outside of the lightbox in order to further diffuse the light. I placed the motorised platform with the specimen inside the lightbox. This was used to provide a clean, even background, as any detail picked up in the background could cause alignment errors. The lightbox also helped to diffuse the LED panels used to light the specimen. The camera was mounted on a tripod and remotely operated to avoid moving the camera during or between image capturing.
(Above) This is just one of the boxes of insect specimens I had access to. Inside this particular box are various beetle species, however the beetles I was particularly interested in capturing ranged from Dicranocephalus to Curculionidae and Xylotrupes.
At this point in my project's development, I was able to see what exactly I was investigating with the photogrammetry workflow. As I began capturing the specimens, I was unable to find any text on whether focus stacking results in better reconstructed models, or whether you can get away with taking one image at a small aperture. That is when I realised my project would involve capturing specimens with two different photography techniques, comparing the two to see if either workflow produces better results, and whether it is possible to alter either workflow to improve them further.
For the focus stacking workflow, I decided to use PICOLAY which is an open source focus stacker. I found this software was incredibly quick and accurate when combining the multiple images. I also found it useful that the software supports batch processing. This meant I could group each position into a folder and the software would stack the images inside each folder one after the other.
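The folder-per-position layout used for batch stacking amounts to a simple grouping of an ordered shot list: every N consecutive focus-bracketed frames belong to one camera position. The file names and frames-per-stack value below are hypothetical, just to illustrate the grouping.

```python
# Sketch: split a flat, ordered list of shots into one stack per
# camera position, ready to be dropped into per-position folders
# for a batch focus stacker to process in turn.
from itertools import groupby

def group_by_position(filenames, frames_per_stack):
    """Group an ordered shot list into one stack per camera position."""
    indexed = list(enumerate(filenames))
    stacks = {}
    for pos, frames in groupby(indexed, key=lambda t: t[0] // frames_per_stack):
        stacks[f"position_{pos:03d}"] = [name for _, name in frames]
    return stacks

# Six consecutive shots bracketed three-deep give two positions:
shots = [f"IMG_{i:04d}.jpg" for i in range(6)]
for folder, frames in group_by_position(shots, 3).items():
    print(folder, frames)
```

Each resulting group maps to one folder, and the stacker's batch mode then produces one composited image per folder.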
(Above) This animated image showcases the different images that would be composited together to create one clean image in the focus stacking software.
(Above) This is a screenshot of PICOLAY stacking an image of the tiger beetle. In this example, the final image is composited from six source images. On the left side you can see how each image has been composited into the final image, and on the right is the final image.
To find more information and download PICOLAY click here
I decided to use Meshroom to convert the image datasets into 3D models. Meshroom is open source software, meaning it is free to download and use, whereas Agisoft's Metashape must be paid for. Whilst Metashape is perfectly fine to use, I decided on Meshroom due to it being free, and because the whole process from camera alignment to mesh generation and texturing can be completed in one autonomous process. In Metashape Standard Edition, camera alignment, pointcloud, mesh generation and texturing are treated as separate stages which have to be manually started. Using Meshroom also meant that if alignment failed, I could test the same image set in Metashape to rule out whether it was a software issue or an issue with the images themselves.
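Meshroom's one-shot pipeline can also be driven headlessly from the command line. Below is a sketch of assembling such a command; note that the binary name (`meshroom_batch`) and its flags come from recent Meshroom releases and vary between versions, so treat them as assumptions to check against your own install.

```python
# Sketch: build a headless Meshroom command covering the whole
# pipeline (alignment through texturing) for one image folder.
# Binary name and flags are assumptions for recent Meshroom releases.

def meshroom_command(image_dir, output_dir):
    """Assemble a headless Meshroom run for one dataset."""
    return ["meshroom_batch", "--input", image_dir, "--output", output_dir]

cmd = meshroom_command("scans/weevil/images", "scans/weevil/model")
print(" ".join(cmd))
```

Driving the pipeline this way makes it easy to queue several specimen datasets overnight, which suits Meshroom's fully autonomous design.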
(Left) This is a screenshot of Meshroom with the tiger beetle scan. In this specific image we can see how the images have been aligned into cameras. The cameras are then able to project points into a pointcloud, and you can see how the alignment card and beetle are beginning to form in 3D. It is also quite important to note that the software has correctly identified that the cameras were at different heights and rotations each time a full 360 was captured of the subject.
(Right) In this image you can now see that the pointcloud has been converted into a 3D model, with texture taken from the images. Meshroom uses a node based system, which means each step of the photogrammetry process is treated as a small process; linked together, these make up the photogrammetry pipeline. Meshroom saves the results of each node locally, which makes transferring saved files simple. This also means the final model and texture are exported automatically at the end of the process, which saves time.
To find more information and download Meshroom click here
These are the final 3D scans captured with both workflows.
(Above) This scan has reconstructed the head, body and top of the legs quite well. It has even projected the textures correctly to ensure that there are only four white spots on the beetle's abdomen. However, there are issues with the geometry: one of the eyes appears concave instead of convex, parts of the legs are missing, and there is uneven geometry around the edge of the beetle's thorax.
(Above) This scan of a weevil has captured the head and body quite well. However, due to its small size and the wide lens on the camera, the subject doesn't quite fill the image, which means the scan isn't as highly detailed as the rest, especially when looking at the texture quality of the insect. This is because I was unable to get the camera closer to the specimen. Parts of the legs are also missing, and whilst the original specimen does have its legs crumpled towards the body, the camera not being close enough meant the software was unable to see enough detail in that area to accurately reconstruct it. The best way to prevent this issue in the future would be either to get the camera closer to the subject or to use a longer lens.
(Above) I believe that, out of all of these scans in both workflows, this particular model came out the best. The carapace of the beetle has been reconstructed fairly well, and the colour of the beetle has been transferred too. Important body parts such as the legs and mandibles have been reconstructed without any missing parts or duplication issues. However, it is not without flaws: one of the antennae is partially missing and, like all of the other scans, parts of the white card have been projected onto the beetle.
(Above) With this scan, the reconstruction is generally ok. The wings are quite accurate as they display as being thin. However, the hair on the bee does cause geometry issues, with parts of the mesh looking quite uneven and distorted. One issue that does immediately stand out with this scan is that the colours of the 3D model seem more diffused and lighter than the original image.
(Above) This is the worst capture out of them all. Whilst it has captured the thorax and abdomen, it has failed to capture the antennae and the legs of the beetle. Also, due to the high F-stop used to capture the images, the ISO had to be raised. This induced quite a bit of noise in the images, which has been transferred onto the model. The issue is a lot more noticeable in the darker areas of the model.
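The F-stop/ISO trade-off here follows from the exposure triangle: each full stop the aperture is closed down halves the light reaching the sensor, so at a fixed shutter speed the ISO must double per stop to compensate. A worked example with illustrative values:

```python
# Sketch: how many stops are lost when stopping down, and what ISO is
# needed to hold the same shutter speed. Example apertures are illustrative.
import math

def stops_between(f_from, f_to):
    """Full stops of light lost when stopping down from f_from to f_to."""
    return 2 * math.log2(f_to / f_from)

def iso_needed(base_iso, f_from, f_to):
    """ISO required to keep the same shutter speed after stopping down."""
    return base_iso * 2 ** stops_between(f_from, f_to)

# Stopping down from f/4 to f/16 costs four stops, so ISO 100 becomes 1600:
print(iso_needed(100, 4, 16))  # → 1600.0
```

This is exactly the mechanism that traded depth of field for sensor noise in the single image workflow.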
To conclude this research project: I found that the focus stacking workflow does produce more accurate and consistent results than the single image workflow. This was identifiable because, when images from the single image workflow were imported into the software, the images were less likely to be aligned; often only a few images would align, resulting in an incomplete specimen with detail from only one point of view. The focus stacked workflow would result in many more images aligning, and from multiple positions around the specimen. I think the reason the focus stacking data sets were more likely to align is the background. In the focus stacking images, the background is out of focus, whereas in the single image workflow the background is a lot closer to being in focus. The photogrammetry software therefore picks up parts of the background, and as it can see the same background features in each shot, it receives conflicting information: it believes the camera is stationary because of the background, but also that the camera is moving because of the specimen. In the end, the images fail to align because of this. One way to properly test this would be to use a completely flat and even background, whereas in my images you can see the unironed nylon of the lightbox.
Going forward, it is my opinion that any photogrammetry scan of an insect specimen should be captured with the focus stacking workflow. By focus stacking, you are not only capturing each part of the specimen in absolute clarity but also helping to separate the specimen from the background, as the background is kept unfocused. Both workflows did result in models that had holes, missing sections or slightly out of focus textures. One way that both workflows could be improved would be to use a longer macro lens. A longer lens reduces the need for the camera to be close to the subject and benefits from lens compression, where the foreground and background appear closer together. Lens compression would make the specimen appear a lot closer to the lens, thereby filling the image and reducing the amount of space the background occupies in the image.
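The longer-lens suggestion can be quantified with a simple pinhole approximation: for the same subject framing, the working distance scales with focal length, so a 90mm macro (a hypothetical choice, not a lens used in this project) could sit roughly three times further back than the 30mm used here. The sensor width used below is the A6000's APS-C value (~23.5mm), and the model ignores macro-specific lens behaviour, so treat it as a rough sketch.

```python
# Sketch: pinhole approximation of working distance. The field width W
# seen at distance d with focal length f and sensor width s is
# W = d * s / f, so for a target field width, d = W * f / s.

def working_distance(subject_mm, frame_fill, focal_mm, sensor_mm=23.5):
    """Approximate distance at which the subject fills frame_fill of the width."""
    target_width = subject_mm / frame_fill   # field width the frame must cover
    return target_width * focal_mm / sensor_mm

# A 20mm beetle filling half the frame: ~51mm away at 30mm vs ~153mm at 90mm.
print(round(working_distance(20, 0.5, 30)), round(working_distance(20, 0.5, 90)))  # → 51 153
```

The extra distance is what makes lighting and turntable clearance easier while still filling the frame with the specimen.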
A lot has changed in a year. I decided to get into teaching in the Further Education sector by completing a PGCE course, of course to teach VFX. Shortly into Term 2 of the academic year 2019-2020, the COVID-19 pandemic disrupted normality. This meant that when I finished my PGCE in June 2020, I suddenly had a lot more free time whilst waiting for job vacancies to emerge. This has given me ample time to refine old skills and learn new techniques and software, as well as take another look at the results of my Masters project and reflect on it a year later. By reflect, I don't mean What, So What and Now What (Borton 1970), as I have had plenty of time to do that both on the Masters and on the PGCE. Whilst I enjoyed the course, as it allowed me to take advantage of exciting opportunities, I felt that the scanned insects I had produced were frankly dire and gave me little hope for my own skillset with photogrammetry. As I believe the images captured should still be viable for reconstruction, I decided to see what improvements have been made to photogrammetry software one year on.
At the time, in June 2019, the three big players were Meshroom, Metashape (formerly Photoscan) and Reality Capture. I was only able to test the first two, as Meshroom is free open source software and Metashape was fairly inexpensive, while Reality Capture was rather expensive. From my own experience, the results from those two are fairly similar. Meshroom is great as all you need to do is press one button and it aligns, meshes and textures the object for you in one workflow, whereas Metashape requires you to interact with the software at each stage: alignment, reconstruction and texturing. Metashape does allow you more precise control over the settings and editing of the data, which is a positive as the process may reconstruct data that you do not need. However, both are fairly slow to scan your models; one object could take a day or even longer. As of June 2020, both have received little improvement, which made me look elsewhere. Below is the weevil model captured with Meshroom. There are major artefacts in the model, from holes to missing sections of legs. Ultimately this is not a great result for my workflow.
At this point, I decided to take another look at Reality Capture and see how much the software costs to use. Whilst I personally find the subscription model hard to engage with, as I cannot guarantee I'll use the software enough to justify the cost, and the perpetual licence is a little high for my current budget, it was a pleasant relief to see Reality Capture has released a new pricing model alongside its existing ones. Pay-Per-Input (PPI) means you can use the software for free with no limitations other than exporting results. If you like the result, you can pay to license the software for that one model. How much each model costs depends on the images you use, however from what I've seen it is easy to create a decent scanned model for cheap (≤£8). This meant I was able to try out Reality Capture to see if I could salvage my project and assess whether the software lives up to the hype. I decided to experiment by rerunning my weevil data through it, as I believe it was the data set I had the most success with (below).
After a few hours experimenting with Reality Capture, I can believe the hype that surrounds it. First, it's incredibly fast compared to the other two. This is important as you'll find yourself spending a lot of time tweaking settings to get perfect captures, and it can be tiring to wait 6 hours before finding out whether adjusting one setting improved the quality or not. This isn't a problem with Reality Capture. Secondly, its alignment is consistent and reliable; with the other two, I found alignment was hit and miss, which again is frustrating when trying to fix issues. Thirdly, it is a lot better at distinguishing the foreground from the background. In my previous scans you can see the insect legs are blended with the white background, whereas in Reality Capture there is a nice distinction between the two. As I say, I have only just found the time to experiment with Reality Capture, and so far I'm impressed. For anyone considering getting into photogrammetry, the new PPI model opens up so much potential for students and hobbyists, and I would greatly recommend trying it out for yourself! This leaves me to say: the show goes on. The new software has made me realise that the saying "a bad workman blames his tools" may not always hold. From my initial tests I can rule out my image capturing process as the issue I originally believed it to be. This is a big relief, as I thought the 1 terabyte of data I had created was useless. Reality Capture opens up so much potential, and hopefully I will be posting some cool scans soon!