
    The AI-generated images trend and what it could mean for NLG [Part 1]

    by Sylvia Nguyen, NLP Engineer

    Introduction 

    This blog post delves into the trend of so-called “AI Art” that is currently flooding social media platforms such as TikTok and Twitter. It covers what defines Artificial Intelligence, what AI Art and AI-generated images are, and what is currently trending.

    What is AI? 

    Before this blog post examines the topic of AI Art, it is worth clarifying what is meant by AI in the first place. AI stands for “Artificial Intelligence” and describes a broad field of computer science that uses neural networks and other complex algorithms to perform specific tasks. These tasks include, for example, speech recognition, translation or personalization of content.

    AI technologies are often composed of machine learning algorithms, deep learning, natural language processing and other techniques that allow computers to “learn”, i.e. discern patterns, from given data and make predictions or decisions based on that learning. AI has a wide range of applications across industries such as healthcare, finance or transportation.

    And what is AI Art? How does it differ from AI-generated images?

    “AI Art”, on the other hand, is an application of AI technology that focuses on using AI algorithms to create, modify or enhance images. Upon hearing the term “AI Art”, one might wonder what actually falls into this category.

    Every art form, from images and videos to music or literature, that is generated by an Artificial Intelligence may be referred to as “AI Art”. This blog entry, however, focuses mainly on artificially generated images. Furthermore, out of respect for human artists, this article refers to images generated by text-to-image models as AI-generated images.

    Tools such as Stability AI’s Stable Diffusion or Midjourney are currently very popular and quickly come to mind when talking about generative text-to-image models. So how exactly do they work? These models generally combine two components: a language model, which processes natural language text entered as input and translates it into a latent, machine-readable representation, and a generative image model, which outputs a result based on that representation. Before being combined with the language model, the image model is trained on a large dataset of text-image pairs. According to Xinyue Shen et al. (2023), the models may vary in the architecture of the language and image generation components and make use of different techniques such as GANs, diffusion models, or transformers.
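
    The two-stage structure described above can be sketched in a few lines of Python. This is a toy illustration only, not how Stable Diffusion or Midjourney are implemented: the function names (encode_prompt, generate_image, text_to_image), the vector length and the image size are all assumptions made for this example, and both “models” are untrained stand-ins that use hashing and random noise instead of neural networks.

```python
import hashlib
import random
from typing import List

EMBEDDING_SIZE = 8  # length of the latent vector (arbitrary choice for this sketch)
IMAGE_SIZE = 4      # toy images are IMAGE_SIZE x IMAGE_SIZE grayscale grids


def encode_prompt(prompt: str) -> List[float]:
    """Stand-in for the language model: map text to a fixed-length latent vector.

    A real text encoder is a trained neural network. Here each word is hashed
    into one slot of the vector, which keeps the example deterministic.
    """
    latent = [0.0] * EMBEDDING_SIZE
    for word in prompt.lower().split():
        digest = hashlib.sha256(word.encode("utf-8")).digest()
        latent[digest[0] % EMBEDDING_SIZE] += digest[1] / 255.0
    return latent


def generate_image(latent: List[float], seed: int = 0) -> List[List[int]]:
    """Stand-in for the image model: turn a latent vector into pixel values.

    Each pixel mixes seeded random noise with one entry of the latent vector,
    so the output depends on both the prompt and the seed.
    """
    rng = random.Random(seed)
    image = []
    for row in range(IMAGE_SIZE):
        pixels = []
        for col in range(IMAGE_SIZE):
            conditioning = latent[(row * IMAGE_SIZE + col) % EMBEDDING_SIZE]
            noise = rng.random()
            # Clamp to 1.0 so the grayscale value stays within 0..255.
            value = int(255 * min(1.0, 0.5 * conditioning + 0.5 * noise))
            pixels.append(value)
        image.append(pixels)
    return image


def text_to_image(prompt: str, seed: int = 0) -> List[List[int]]:
    """Chain the two stages: prompt -> latent representation -> image."""
    return generate_image(encode_prompt(prompt), seed)
```

    Calling text_to_image("a castle in the clouds", seed=42) returns a 4x4 grid of grayscale values. Repeating the call with the same prompt and seed gives the same grid, while a different seed gives a variation, loosely mirroring how image tools offer several variants for one prompt.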

    The required input is usually provided by a user, who can enter a prompt according to their own liking and vision (all within the confines of community guidelines, of course) and let the magic happen without lifting another finger. In some cases, however, the user has a few more options before generation: they may choose from different art styles, as in Dream by WOMBO, or even upload their own pictures as references. The user can then either save one of the results or invest more time in refining the prompt for details and in working through variants or upscaled versions of a selected image, as in Midjourney.

    Another popular use case on the social media platform TikTok is the “AI Portrait” effect, which lets users create impressive pictures of themselves in other timelines or as a fictional character. OpenArt’s AI Model Workspace offers this kind of personalized AI image by asking the user to upload 20 photos, e.g. portraits, to get pictures of themselves in multiple styles for personal use, entirely without prompts. The range of tools and use cases for AI-generated images seems limitless.

    To illustrate the high quality of the results generated by one of the current AI-powered image generation tools, I would like to share some images below.

    AI-generated images – What is trending at the moment?

    As Artificial Intelligence technologies advance and attract more attention, a new trend of AI, and especially of AI-generated images, has started to flood social media platforms. Platforms such as Instagram are saturated with realistic and surreal images, often of dream-like quality, while TikTok offers photo effects such as “AI Manga” or “AI Painter” and countless AI-focused videos. There is an abundance of AI-based content: from users who increasingly share the results of their experiments with AI image generators, to informative videos about Artificial Intelligence read by an AI-created avatar, to tutorials about helpful AI tools in all kinds of specialized fields.

    And even when it’s not explicitly mentioned, AI technology can nowadays be found almost everywhere: in personalized advertisements on Instagram, in shopping recommendations on Amazon, in speech recognition software or in practical applications such as spam mail filtering.

    But it is only since the growing supply of AI-generated images became accessible to the masses through social media that Artificial Intelligence has moved further into the focus of public attention. A growing number of social media users are turning to the vast number of AI-driven generative tools to create unique and sensational images for their profiles, which results in stunning and attention-grabbing works. It is precisely these surreal “works of art”, which seem perfect at first glance, that attract a massive number of likes and comments on the platforms and draw in more and more curious users who want to try the tools themselves.

    Social media has made it easy to gather an audience, to spread information and content, and to give the average person access to test tools. Stable Diffusion and Midjourney, for example, invite users to join their Discord servers, which serve as a forum for discussion and a platform to test the tools in the comfortable familiarity of a potentially known environment. Young people in particular (and therefore potential new users or promoters) can be reached more easily in this way.

    In the meantime, however, AI-generated images have also found their way into the commercial market and into domains beyond personal entertainment. For example, according to Time, 28-year-old Ammaar Reshi published his children’s book “Alice and Sparkle” on Amazon’s Kindle Direct Publishing platform in December 2022. What makes the book notable is that its text as well as its illustrations were created with the help of ChatGPT and Midjourney.

    In another case, the New York Times reported that Jason M. Allen’s artificially generated work “Théâtre D’opéra Spatial” won the “New digital artist” category in the annual Colorado State Fair art competition, making it one of the first AI-generated images to do so. Here, too, Midjourney was used for image generation.

    Another case caused a great commotion: as reported by TechCrunch, the release model of Stable Diffusion was leaked on the discussion platform 4chan and was misused for pornographic purposes. With the help of the open-source image generation model, users had generated nude deepfakes of celebrities. As a result, Stability AI implemented measures to prevent the generation of offensive content.

    In all of these examples, people expressed concerns and criticism as well as admiration for the resulting outputs. Some felt offended, others shared their worries. In any case, the utility and value of AI-generated imagery to society is highly controversial at this point. Some of the debated issues will be discussed in the upcoming second part of this post. So stay tuned!

    Interesting articles to delve deeper into the matter

    AI-Generated Comic Book Could Lose Copyright Protection

    AI Timeline for Text-to-Image Machine Learning Models