Ran Bensimon - Creative Technologist - 3D Modeling and Stable Diffusion Techniques

Lara Croft, the iconic protagonist of the "Tomb Raider" franchise, has been a symbol of adventure, courage, and innovation for generations. Her creation and evolution have been a testament to the advancements in 3D modeling, rendering, and character design. This article explores the traditional methods used in creating Lara Croft and contrasts them with an innovative approach using Stable Diffusion #SDXL Models and Low Rank Adaptation (#LoRA) training. The comparison sheds light on how these techniques could be game-changers in the field of image creation.

Image source: wikiraider.com (AI Upscaled)

The Creation and Evolution of Lara Croft

Lara Croft was created by Toby Gard, a British developer at Core Design, and first appeared in 1996. Her design has evolved over the years, with different developers like Crystal Dynamics taking over and making significant changes to her appearance, backstory, and gameplay mechanics. Her journey through the digital landscape has seen her transform from a 400-polygon figure to a stunningly realistic character.

Image source: Reddit

Traditional 3D Modeling and Rendering

Low Poly and High Poly Models: Lara's creation involved both low poly models for efficient real-time rendering and high poly models for detailed cinematic scenes.

Image source: Google Images

Rigging and Animation: Rigging involves creating a skeleton for the 3D model, allowing it to move in a realistic manner. Animation adds life to the character through motion and expression.

Image source: khatoblepas

3D rigging and Stable Diffusion's #ControlNet #Openpose are both concerned with the representation and understanding of body movement but operate in different contexts and through different technological approaches. 3D rigging is a process used in computer animation to create a digital skeleton for a 3D model, allowing for lifelike movement and detailed control over deformations. It's primarily used in 3D animation for films and video games and can be complex and time-consuming, relying on specialized 3D modeling software.

Image source: khatoblepas

On the other hand, ControlNet Openpose is a deep learning-based method for human pose estimation in 2D images and videos. It uses neural networks to identify key points on the human body and estimate poses, adapting to various scenarios and lighting conditions. While also complex, it relies on deep learning frameworks and has applications in computer vision tasks such as human-computer interaction and augmented reality. In essence, 3D rigging focuses on creating movement within a virtual environment, while ControlNet Openpose interprets and understands movement within real-world images and videos.

Texturing and Shading: These processes added surface details and lighting effects, creating a lifelike appearance.

Rendering Engines: Advanced renderers, similar to modern tools like Octane, Redshift, or Arnold, were used to produce high-quality images and animations.

Fan Art and Augmented Reality, Created by me

Short animation: Original Tomb Raider Chronicles model (from the 2000s) extracted by E.Carnby and RoxasKennedy from within the video game files. Blender porting, animation, boots, environment, camera, and sound design by me. Rendered in OTOY #octanerender.

Short animation: Original Tomb Raider Chronicles model. Blender porting, animation, physics, boots, environment, camera, and sound design by me. Rendered in Octane.

Instagram AR filter: Replacing the user's head with Lara's, reactive to movement, tap gestures, facial animations and more. Try it yourself here. Created in Meta #SparkARStudio

Another Instagram AR filter: Adds Lara to the scene, interacting with the user, reactive to movement, tap gestures, facial animations and more. Try it yourself here. Created in Meta #SparkARStudio

TikTok AR Effect: Making use of the in-game model to track the user's body in Realtime. Try it here. another view can be seen here. Created with #EffectHouse

AI-Powered Reimagining: Stable Diffusion

Stable Diffusion represents a cutting-edge application of deep learning, designed to generate images through various creative techniques. These techniques include filling in missing parts of an image (inpainting), extending the boundaries of an image (outpainting), and transforming text descriptions into visual representations (text-guided image translations). Together, these methods allow for the creation of high-quality visuals, including videos.

In the context of reimagining Lara Croft, I used Stable Diffusion to train a model that captures the essence of the iconic version of Lara from the classic games, specifically in her high poly format. The training process was conducted over 10 epochs, a carefully chosen number that provided the AI with sufficient learning time.

Official Renders of Lara Croft Throughout the Years

An epoch, in machine learning terms, refers to a complete cycle of learning through a dataset. To illustrate, if one were to learn from 50 images, reading each 10 times, a single epoch would encompass 500 trainings (50x10). Two epochs would double this, resulting in 1000 learning iterations.

Upon completion of the specified number of epochs, a LoRA (Low-Rank Adaptation) file is generated and saved to a designated location. LoRA is a method that accelerates the training of large models while consuming less memory, making it a vital component in this process [2, 3].

This technique is revolutionary in that it encapsulates all the facets of 3D creation without the need for additional 3D work. It represents a paradigm shift in image creation, offering a new pathway for digital artists and a glimpse into the future of AI in art.

Stable Diffusion Animations

#Deforum (and #warpfusion) is a platform/extension that leverages Stable Diffusion technology to enable users to create personalized animations. The animations below featuring Lara Croft in a MUGLER Bodysuit, combining two trained LoRA models to craft a unique visual blend of Lara with contemporary fashion.

Mugler Bodysuit SDXL model is available to download and experiment with on Civitai - https://civitai.com/models/123699/muglerbodysuitxl

In conclusion, the reimagining of Lara Croft through AI-powered techniques like Stable Diffusion and LoRA training represents a remarkable fusion of technology, art, and creativity. This exploration serves as a case study to demonstrate the capabilities of new technologies and also an artistic tribute to this important character. While the use of famous characters can raise questions about intellectual property, this work is intended for educational and artistic expression, highlighting the exciting possibilities in the field. The journey of reimagining iconic characters like Lara Croft serves as both an inspiring testament to human creativity and a reminder of the potential and innovation in digital artistry.

Finally, I want to thank Dr. Furkan Gözükara for the extensive knowledge sharing and databases. Definitely check out his resources if you want to learn more about Generative AI.

Stay tuned for more content like this.