How to create a book for children using AI
By John Barron, a retired computer scientist who has had a lifelong interest in AI
I recently used AI tools to write the text and create the images for a children's novel entitled 'Fae - the magical fairy boggart'. The first page of the story is illustrated below and the full story can be accessed by following the link embedded in the title and image. ChatGPT wrote the text, and Stable Diffusion generated the illustrations: four candidate images for each passage, from which I selected the one I liked best. The text and story were created with minimal human editing and steering along the way.
Please note that any creative work I produce is licensed as Creative Commons CC-BY-SA by default. CC-BY-SA means that you, as a reader or editor, are explicitly allowed to make changes if you see fit (you don't need my permission); all that this licence requires is that you attribute me, and that you extend the same licence conditions to anything you create from it.
What is OpenAI ChatGPT?
OpenAI is an American artificial intelligence (AI) research laboratory, based in San Francisco and founded in 2015. Elon Musk, one of the original founders, resigned from the board in 2018. GPT stands for 'Generative Pre-trained Transformer'. ChatGPT is an artificial intelligence tool, or chatbot, with a conversational interface that allows users to ask questions in their natural language. The system responds with an answer within seconds. ChatGPT reached 1 million users within 5 days of its launch, and it is already causing disruption. It has the ability to write and debug computer programs, to compose music, teleplays, fairy tales, and student essays, to answer test questions (sometimes, depending on the test, at a level above the average human test-taker), to write poetry and song lyrics, and more.
An important point to understand is that ChatGPT does not follow a simple, linear algorithm. Instead, it's more akin to a human browsing stories and images and then creating original work based on that learning, using correlation and extrapolation. The difference is that the AI can do it at superhuman speed, drawing on a vast body of existing human work. At the moment, it works best in English, but it can also function in some other languages, with varying degrees of fluency. Unlike some other recent high-profile advances in AI, there is, as yet, no official peer-reviewed technical paper about ChatGPT.
The Creative Process using ChatGPT and Stable Diffusion
The process I used was very simple. First, I asked ChatGPT to write an outline of the story I wanted to see, which forms chapters 1 and 2. Then I fed each paragraph of that outline back into ChatGPT, asking it to expand on that part of the story and prompting it as I saw fit. The expanded passages make up the rest of the chapters.
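The outline-then-expand loop described above can be sketched in a few lines of Python. This is only an illustration of the workflow, not the exact prompts I used: the function names, the prompt wording, and the `ask` callable (a stand-in for typing a prompt into the chatbot and copying back the reply) are all assumptions made for the example.

```python
import re
from typing import Callable, List


def split_paragraphs(outline: str) -> List[str]:
    """Split an outline into non-empty paragraphs separated by blank lines."""
    return [p.strip() for p in re.split(r"\n\s*\n", outline) if p.strip()]


def build_expansion_prompt(paragraph: str, steer: str = "") -> str:
    """Build the text sent back to the chatbot for one outline paragraph.

    The wording here is a hypothetical example, not a recorded prompt.
    """
    prompt = "Expand this part of the story into a full chapter:\n\n" + paragraph
    if steer:
        prompt += "\n\nGuidance: " + steer
    return prompt


def expand_outline(outline: str, ask: Callable[[str], str]) -> List[str]:
    """Return one expanded passage per outline paragraph, in outline order."""
    return [ask(build_expansion_prompt(p)) for p in split_paragraphs(outline)]


if __name__ == "__main__":
    # Stand-in for a chat session: it only reports the prompt length.
    def fake_ask(prompt: str) -> str:
        return f"[passage drafted from a {len(prompt)}-character prompt]"

    placeholder_outline = "First outline paragraph.\n\nSecond outline paragraph."
    for passage in expand_outline(placeholder_outline, fake_ask):
        print(passage)
```

Keeping `ask` as a parameter means the same loop works whether the prompts are pasted into a chat window by hand or sent by some other means; the optional `steer` argument is where the human prompting "as I saw fit" would go.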
The illustrations were all created by Stable Diffusion (a deep learning, text-to-image model released in 2022) running locally on my computer hardware. I fed each paragraph of text in, and requested four images, from which I picked the one I liked best.
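The "four candidates, pick one" step can be sketched the same way. Again this is a rough illustration under stated assumptions: `generate` is a hypothetical stand-in for whatever local Stable Diffusion front end is in use, `choose` stands in for the human looking at the four images, and the use of four different seeds to get four different images is my assumption for the example rather than a record of the settings used.

```python
from typing import Callable, Dict, List, Sequence


def candidates_for(paragraph: str,
                   generate: Callable[[str, int], str],
                   seeds: Sequence[int] = (0, 1, 2, 3)) -> List[str]:
    """Request one candidate image per seed for a single paragraph."""
    return [generate(paragraph, seed) for seed in seeds]


def illustrate(paragraphs: Sequence[str],
               generate: Callable[[str, int], str],
               choose: Callable[[List[str]], int]) -> Dict[int, str]:
    """Map each paragraph index to the candidate picked by `choose`."""
    picked: Dict[int, str] = {}
    for index, paragraph in enumerate(paragraphs):
        options = candidates_for(paragraph, generate)
        picked[index] = options[choose(options)]
    return picked


if __name__ == "__main__":
    # Stand-in image generator: returns a file name instead of an image.
    def fake_generate(prompt: str, seed: int) -> str:
        return f"candidate_seed{seed}.png"

    # Stand-in for the human choice: always take the first candidate.
    def first_one(options: List[str]) -> int:
        return 0

    pages = ["First paragraph of text.", "Second paragraph of text."]
    print(illustrate(pages, fake_generate, first_one))
```

In practice the choice is the part that stays human: the code only organises the candidates so that each paragraph ends up with exactly one illustration.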
Then I laid the book out with suitable fonts and text flow. I have done this sort of copy-editing before, although I didn't get it quite right at first. I had to adjust it a bit once I received the initial proof copy in print form. The entire process took maybe two or three weeks, from conception to print form, for a 15-page graphic novel for children and adults, fully illustrated, and around 1,600 words of text.
The Future of AI
AI will change the world just as much as the rise of the internet and the now ubiquitous presence of mobile phones in our lives, both of which happened in just a couple of decades. AI is already part of our lives, and the world is changing out of all recognition. There is quite a bit of understandable human fear over what is happening, and over the speed of the changes. Like most tech, however, it's not possible to put the genie back into the bottle, so the real question is how we adapt to it.
There are unsettled questions over who the creator really is. Is the author someone such as myself, prompting and steering the AI? Is it the AI itself? Is it the original creators whose work was used (most likely without permission) to train the AI? Or is it the creators of the AI software? There are many misconceptions around the process, but the AI keeps no database of copied images, and the neural network develops a level of comprehension of what it has been looking at that diverges from the training data it started from.
We won't really know the answers to any of these questions until they are settled in court, or by legislation. I can report that ChatGPT is not like anything I have used before, and what it creates is new, interesting, and stands on its own as an original creative work, which would not otherwise have existed. And it's changing and developing faster than you might imagine. As things stand, there is a new and vibrant community of people experimenting with these new creative tools, sharing their ideas, and often creating amazing results. Will these early adopters be the successful artists of the future? Do we need a new term for them? Maybe the term 'creatives' better describes what the new process is about.
Note on text
This article was augmented with some content from Wikipedia.