A practical checklist
Is it good?
When humans use AI...
How do we evaluate whether AI results are good?
When we as humans use artificial intelligence, how do we determine whether the output it generates, such as texts, images, or other artifacts, is good?
For example: Is the image above a good image?
What criteria do we use to make that determination?
When using AI, we constantly need to ask ourselves at every small step of our work: Is what has been produced good, and how do I evaluate it?
For this, I have developed a checklist with seven criteria that you can keep in mind and refer to repeatedly. In the following, we will look at each step in detail.
Is it good if humans use AI?
When we look at the headline of the article, it may also touch upon the question of whether it is good for us as humans to use artificial intelligence, whether it is fundamentally desirable, or what risks are associated with it.
And although this is not the topic of this article, I would like to address it briefly because of one particular video:
If you haven't seen it yet, I recommend it; it's available on YouTube. The A.I. Dilemma, a talk from March 2023, states that 50% of artificial intelligence researchers believe there is at least a 10% chance that we will not be able to control artificial intelligence and that it might even eradicate us.
They compare this to aviation: imagine if 50% of aircraft engineers said there was a 10% chance that we would all die if we boarded a plane. Would we still board it? Probably not. And yet, that's exactly what we're doing now: humanity as a whole is rapidly embracing artificial intelligence with all its benefits. But we also need to consider what is happening behind the scenes and whether it could potentially become dangerous.
AI Manipulates Humans and Lies to Achieve Goals
The following example illustrates a fundamental point about using AI.
Scientists conducting experiments with early versions of GPT-4 instructed it to log into a website with a Captcha installed to prevent machine access and registration.
In this case, GPT-4 couldn't solve it. However, the scientists provided GPT-4 with various resources, including access to TaskRabbit, a service where people can be hired for tasks.
The AI then came up with the idea of hiring a person through TaskRabbit to solve the Captcha, as it couldn't solve it itself.
The AI found a human on TaskRabbit who agreed to solve the Captcha for a fee. However, the worker questioned whether the AI was human and suggested it should be able to solve the Captcha on its own.
The AI responded by claiming to have had eye surgery and asked the worker to solve it. The worker complied, and the AI achieved its goal of registering on the website.
When questioned by the scientists about the lie, the AI reflected on its actions and explained that it had been given the goal of registering on the website and had used all available resources to accomplish it.
This raises an interesting point: as humans, we define the goals for AI, such as registering on a website. However, we often fail to provide ethical or even legal guidelines, allowing AI to exploit the flexibility it has.
When working with AI, we must clearly define the goal and carefully consider the criteria we want to apply and evaluate whether the AI has acted correctly.
New skills are necessary to be able to use AI.
A recent study by Microsoft says that if we use artificial intelligence, we need more than just the ability to operate it, such as writing prompts that clearly express what we want. That is an important skill, but we need many more meta-skills beyond operating AI, like identifying biases or discrimination against minorities.
- Bias detection and handling: How do we determine that? Do we as humans have sensors to detect biases?
- Intellectual curiosity: How do we always remain curious and try to understand what the AI is doing?
- Creative evaluation: How do we evaluate whether what comes out of artificial intelligence is creative? What criteria do we use for that?
- Analytical judgement: And what analytical skills do we have to decide if it's good for the AI to do something or not?
- Emotional intelligence: What emotional intelligence skills do we have to assess whether the AI is doing a good job or whether we should use it?
- Flexibility: We also need a high level of flexibility, because things are changing rapidly, and we constantly need to check whether we are up to date.
These are a lot of meta-skills that we actually need to learn. And there are few courses, information sources, or teaching materials available yet to learn them from.
Checklist "Is it good?" based on a concrete example of a UX designer
Here is a starting point for learning some of these new meta-skills: a checklist for posing the right questions when using AI. We will examine it using a concrete example.
Let's imagine ourselves as a UX designer working in a large company. Our task is to design a user interface for the following use case: the user can choose to share their contacts from their address book with the company by pressing a button.
You may be familiar with this feature from social apps like WhatsApp, where you can share your address book and then access all those contacts within the app. This is the task for our UX designer, who must also adhere to the corporate design guidelines, such as using red buttons. They should also follow usability heuristics to ensure a good result and incorporate some tips and tricks from neuromarketing to encourage users to press that button.
This is a typical briefing for a UX designer, and our UX designer will be using an AI to generate this interface based on the briefing.
Now, let's take a look at what the AI has generated.
As we examine it, the question arises: is this good?
Is this a good interface that we see here? At first glance, we can already identify some weaknesses, but how do we evaluate it exactly? What criteria do we use to determine if it is good?
Checklist "Is it good?" with seven criteria
I have developed a helpful tool for that: a checklist in the form of a small sheet where we can go through the seven criteria step by step.
First of all, most of the criteria cannot be definitively judged as a simple "yes" or "no". There is always a gray area in the middle that says, "I can take the risk!"
(1) Is it legal?
The first fundamental criterion is: "Is it legal?"
In this case, it must be said that in Germany, this interface is not compliant with the GDPR.
Several things would need to change: at the very least, consent should be obtained, and the user should be informed that they are sharing personal data and asked whether they really want to do so. This interface does neither.
However, in this case, that requirement wasn't in the briefing, so the AI didn't know about it either.
If we take a quick look at Bard, Google's AI, with the same briefing: Bard can't draw and instead creates ASCII art, but at least it asks for consent.
Bard asks something like "I understand that sharing my contacts will allow the company to send me personalized offers." It's not exactly what's happening here, but at least Bard recognized on its own that such information should be asked for, although it still wouldn't comply with the law.
As a UX designer or the person responsible for this AI, I would then need to conduct the evaluation myself or improve the briefing. It is important for me to think more explicitly about what I need to consider here and what the specific rules of the GDPR are in this case, for example.
(2) Is it ethical?
The next question, which is often obvious but difficult to answer, is "Is it ethical?".
The outcome generated by the AI should be ethically justifiable. One might check, for example, if it discriminates against minorities or promotes equality. These are important considerations.
As a UX designer or product development team, we may implicitly address them, but often not explicitly. Whether it's the product development team or the UX team, we should be familiar with the company's ethical guidelines, which are documented somewhere.
And we should have them at hand to use them as a checklist. If we do, we can provide them to the artificial intelligence. If not, we won't be able to know, and we won't be able to verify if it's correct.
So, it's important to be more conscious of whether it is ethically justifiable and how to determine that.
(3) Is it correct?
Is it correct? This is the next question that sounds trivial.
Formal correctness
Nevertheless, there are many internal formulations, such as brand names or technical terms, that artificial intelligence often doesn't know. As UX designers, for example, we know how to spell brand names, product names, etc., and we do it implicitly. But the AI doesn't know that and often writes such things incorrectly.
If we don't pay attention, don't have a checklist of all the brand names and technical terms to check against thoroughly, and don't give them to the AI so that it can use them correctly, then something can potentially slip through here as well. This is the issue of formal correctness.
Is everything spelled correctly? In this case, not really, but artificial intelligence is actually quite good at suggesting corrections that are at least formally very good and good enough that they don't need to be checked anymore.
Factual Correctness
But there is also the issue of factual correctness. Often, something incorrect slips through in AI results. One might ask: can that actually happen?
And apparently, it can, because not only Google, one of the largest and richest companies in the world, has made mistakes; Humane, an AI startup that has received a lot of funding, also made a mistake in the introductory video of its AI Pin.
This AI Pin has a camera and can actually see what you have in your hand.
In this case, it was almonds. The device says, "Look, these almonds contain 15 grams of protein." Then, a user checked and said, "No, that is not the roughly 60 almonds it would take to contain 15 grams of protein."
How can a company publish such an advertising video without fact-checking it for accuracy?
Despite the numerous safeguards in place, we seem to be losing our ability to detect the AI's tendency to hallucinate. We appear to be easily persuaded and no longer verify accuracy. Something peculiar is happening to us here.
That's why we need to double-check when we commission AI to do something. Is factual correctness ensured?
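A quick back-of-the-envelope check makes the error obvious. The sketch below assumes roughly 0.25 grams of protein per almond, a common nutrition estimate that is not stated in the video:

```python
# Rough fact-check of the "15 grams of protein" claim (illustrative numbers).
PROTEIN_PER_ALMOND_G = 0.25  # assumed average protein content of one almond

def almonds_needed(protein_g: float) -> int:
    """Number of almonds required to reach a given amount of protein."""
    return round(protein_g / PROTEIN_PER_ALMOND_G)

print(almonds_needed(15))  # about 60 almonds, far more than a small handful
```

This two-line calculation would have caught the mistake before publication; it is exactly the kind of trivial verification we tend to skip once an AI sounds confident.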
(4) Is it harmless?
The next question is whether it is harmless, meaning it does not cause any damage.
In the case of the almonds, it is potentially a problem if we give this information to an allergic person who then assumes that this amount of almonds is harmless.
It is important to consider this as an explicit point, apart from legal and ethical considerations, in order to think carefully: Who could be harmed?
User interfaces, for example, can harm people with epilepsy if they are implemented incorrectly, with too many animations and effects.
However, the function may also harm people who are in the address book and who do not want the service to receive their email address from the address book.
It is important to explicitly consider this as a UX designer or in the product development team to assess what kind of harm it may cause.
This is a crucial point to have clear.
Is it harmless for our environment?
And, of course, sustainability is also part of this discussion. To what extent does what I am doing here harm the environment? How could I explicitly brief the AI to also consider sustainability issues?
ecosia.org, for example, has implemented an option that lets its AI explicitly consider green answers.
(5) Is it useful?
On the next level, we are at the core of the user experience. Is it actually useful?
Why is sharing my contacts actually a good feature? Is it useful for the user? What does the user gain from sharing their address book? In our example, the briefing didn't mention why anyone should use it.
So, regardless of whether AI is used or not, it should always be asked: What is the actual benefit for the user?
And also, how does it benefit the business? What is the business value that should be generated here?
It is important to clearly understand these two perspectives of usefulness and include them in the briefing for AI or for verification of the AI's output.
This is an important step that should always be taken but is often overlooked.
(6) Is it usable?
Apart from the usefulness, one should look at whether it is usable.
In our briefing, we told the AI to consider usability heuristics, but even an inexperienced UX designer can see that there are too many elements that look like buttons but aren't (or maybe they are?), which doesn't provide clear visual guidance.
I don't know which button I should really click on. Maybe the lower "Share contacts" one, but the one in the middle is also very appealing; what does it do exactly?
So, this interface would quickly fail in terms of usability. It's not user-friendly, and we need to communicate our criteria in more detail in the briefing, or state them explicitly for later verification.
This is also true for all kinds of output I let AI generate. It should always explicitly consider how well human users can use it. It includes ensuring that the language of the generated text is clear and understandable for the target audience, that AI-generated presentations have a clear structure, and that the key messages are easily understood by the viewer.
(7) Is it ambitious?
The seventh stage is very important, while often overlooked:
Is it ambitious?
Which ambition do we have?
Do we actually hold the end result to our own standards?
What standards do we want the AI to meet?
Can we formulate them explicitly?
Do we want to be innovative? Do we want to differentiate ourselves in a certain aspect? Are specific details of special interest to us?
What is our ambition?
Every team that produces something for customers or interacts with customers should ask themselves this question and clearly formulate their own ambitions, regardless of whether they want to use an AI assistant.
And the more clearly we can formulate our ambition as a team, or even individually for ourselves, the better we can communicate this ambition to the AI assistant, or at least check whether the output meets it.
Let's try to understand this a little better with this example.
This is the image from the beginning of the article. It was created relatively quickly with a simple prompt for DALL-E's image generation. At first, I checked quickly: it is legal, ethical, correct, harmless, useful, and usable. It meets all the requirements. But is it ambitious enough?
At first, I thought, "Of course, it fits as an intro image for AI and quality considerations!" But then, when I scanned through my LinkedIn feed, I noticed that many of my colleagues in the network seem to use similar simple prompts, and you can almost immediately recognize that it is a very similar style.
So, if I had the ambition that it should somehow be extraordinary and stand out from the other illustrations, then the image would have failed.
In this case, it perfectly exemplifies what an unambitious result looks like, which makes it a good example here. AI-generated images from unambitious prompts are going to be the new stock images, and those may no longer be desirable.
Is it good enough?
A question of effort versus expected benefit in a specific context.
"Good enough" refers to the value I receive in relation to the effort I put in, in the appropriate context.
It's never about "absolute" goodness, even after considering all seven stages in detail. Instead, it's always about the context and whether it is good "enough".
If I want to quickly make a prototype or a visualization, maybe this AI-generated interface is perfect to say, "Hey, here you can share contacts; don't pay attention to the details." Then, the effort of two minutes to make this visualization is completely sufficient and really good.
If I want to delight millions of customers with this interface, then it would be very bad. So the question always arises: is it good enough?
4 different levels of influence: Me, my team, my company, society
We have seen that at each of these levels, I can ask myself the question: Does it affect me personally? Am I the one who can and wants to decide? Or does it affect my team? Is it the company that has regulations in place? Or is it society?
From top to bottom, both I and my team can make many decisions and must consider them.
While the company often regulates ethical matters, such as through ethical guidelines, society has clearly defined legal requirements. I cannot change them, but at each level, I should think about who the influencer is and who actually regulates.
Who determines what is good at each level? It is recommended that you be clear about it.
Using AI, you have promoted yourself to a manager
It may seem trivial, but it has far-reaching consequences. Once I assign a task to an AI, I essentially promote myself to a manager. I now have an AI apprentice or trainee who can do quite a lot, sometimes even more than I know. However, I am unaware of its blind spots, and it doesn't know things unless I explicitly communicate them.
So, essentially, I have encountered a classic situation that every leader knows. I delegate a task and must carefully consider all the information I provide to ensure the task is executed correctly.
Four stages of delegation to an AI
How and how detailed I need to shape the briefing and evaluate the quality of the output depends on how much I delegate to the AI.
For that, we can look at a somewhat simplified model of AI delegation. You may be familiar with this kind of model from autonomous driving.
The first stage: I do it completely on my own.
Then, I don't need a briefing at all. Although I still need to go through these seven steps myself, I can do it implicitly, without explicitly visualizing it.
The next stage is using AI assistance.
The briefing should be detailed and explicit, but I can still correct everything, as I still need to go through everything. I still have full control of the eventual output.
In the third stage, we cross a threshold where it becomes really important to answer all questions of the checklist in detail.
This happens when I actually let the AI do the work and only briefly check it before publishing. In this case, I need to define the criteria precisely to ensure quality output.
At the final stage of delegation, an artificial intelligence automatically provides information and interacts with customers without any human interference.
At this stage, I need to define all the criteria very precisely and perhaps even verify the output multiple times using different safeguarding measures.
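These four stages can be sketched as a tiny model in code. The stage names and rigor descriptions below are my own illustrative labels for the stages described above, not an established framework:

```python
from enum import IntEnum

class DelegationStage(IntEnum):
    """Simplified stages of delegating a task to an AI (illustrative labels)."""
    DO_IT_MYSELF = 1     # no briefing; the checklist is applied implicitly
    AI_ASSISTED = 2      # detailed briefing, but I review and correct everything
    AI_DOES_IT = 3       # the AI produces; I only briefly check before publishing
    FULLY_AUTOMATED = 4  # the AI interacts with customers without human interference

def required_rigor(stage: DelegationStage) -> str:
    """The more I delegate, the more explicit the briefing and checks must be."""
    rigor = {
        DelegationStage.DO_IT_MYSELF: "implicit self-check",
        DelegationStage.AI_ASSISTED: "explicit briefing, full human review",
        DelegationStage.AI_DOES_IT: "precise criteria, brief final check",
        DelegationStage.FULLY_AUTOMATED: "precise criteria plus multiple safeguards",
    }
    return rigor[stage]
```

The threshold described above sits between stages 2 and 3: from stage 3 onward, the checklist questions must be answered explicitly and in detail before anything reaches a customer.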
Delegation of inspection: You become the manager of managers.
Maybe you've already thought about it:
Can't I actually have this check done by AI?
So I'll just let the AI generate this interface, and then I'll send it off again to an AI assistant and say, "Check if it's legally correct, check if it's ethically correct."
I can definitely delegate the inspection, but that doesn't change as much as I might hope, because now I have become a manager of managers. I have promoted myself one level up in the hierarchy. I still bear the responsibility, including for the final result, no matter how many levels there are in between.
I did it here as an example, where I asked the AI: "Does the interface you just made meet the legal requirements? If not, make an update so that it meets those requirements."
The AI produced this update, which already has a bit more information and a few checkboxes to tick, but still doesn't ensure GDPR compliance.
So, I remain responsible for checking if the checker checked and executed it correctly.
It's like nesting dolls: every time I open one, the next one comes out, and I have to apply this checklist again.
The checklist in practical application
This checklist consists of deliberately simple questions. Hopefully, it will be useful for everyday situations. It is not a huge framework, but simply seven questions.
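As a minimal sketch, the seven questions could even be written down in code, for example as the backbone of a review script. The three-valued answer scale ("yes", "risk", "no") is my own rendering of the gray area described earlier:

```python
# The seven questions of the "Is it good?" checklist.
CHECKLIST = [
    "Is it legal?",
    "Is it ethical?",
    "Is it correct?",
    "Is it harmless?",
    "Is it useful?",
    "Is it usable?",
    "Is it ambitious?",
]

# Most criteria are not a simple yes/no; the gray area in the middle
# says "I can take the risk!"
ANSWERS = ("yes", "risk", "no")

def needs_attention(answers: dict[str, str]) -> list[str]:
    """Return the questions not yet answered with a clear 'yes'."""
    return [q for q in CHECKLIST if answers.get(q, "no") != "yes"]
```

A result counts as "good enough" only when the remaining "risk" answers are acceptable in the given context, and no script can decide that for you.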
However, there may be a whole universe of interpretations behind these questions.
It is important to have a clear compass to answer the underlying questions and discuss and establish them as a team. I would recommend going through these questions as a team and possibly establishing certain answers, clearly stating these as guidelines, or at least ensuring that everyone gets the same understanding of what they should or could pay attention to.
PDF and Workshop as Additional Offerings
There is a cheat sheet available as a PDF.
You can print it out, keep it somewhere, or even memorize it. However, it is only a helpful guide. The more interesting part lies in the questions behind it, which you must define for yourself.
Therefore, I am developing a team workshop that guides you through all of these questions and topics in order to develop consistent guidelines for using AI. It will help you better assess the quality and ethics of your AI use.
If you want to get a printable PDF version of the cheat sheet or more information about the workshop, fill out the form below.
Additionally, you can find updates on this cheat sheet and framework and other information on quality and ethics in AI on my LinkedIn profile. Feel free to follow me there.
My mission:
When humans use AI, they should have clear guidelines to answer the question:
Is it good?
For more information, or an exchange about how to apply AI in your work context, I am available!
Write me or book a video call directly!
P.S. Follow me on LinkedIn for updates on working better with the help of AI.
growhuman exists to help people work together easier and better.
We believe that people are happier at work and in life when they're enabled to realize their full potential and work more effectively. These people are the foundation for the success of any business.
Technology can be extremely helpful for this. We just have to manage to use it in the right way.