The early years of AI — full of optimism that lasted roughly 15 years.
History 8 min Beginner April 13, 2026
In 1950, a mathematician asked a question that no one could answer: "Can machines think?" Rather than getting stuck in philosophical debate, he proposed a simple test — and launched a new field of inquiry.
Six years later, a group of researchers gathered at Dartmouth College and gave that new field a name: Artificial Intelligence. The programs they built were brilliantly simple and severely limited. Understanding why they failed is the key to understanding modern AI.
The Question — Turing's Imitation Game
Why did Turing reframe the question? "Can machines think?" is philosophically unanswerable. Turing replaced it with an observable question: Can a machine behave in a way that we cannot tell apart from a human? What is tested is behavioral imitation, not consciousness.
The Turing Test
Analogy:
Imagine chatting with someone on WhatsApp and being unable to tell whether it's a real person or a chatbot. If you can't figure it out despite actively trying, the bot has passed a kind of Turing Test.
Definition:
A behavioral test proposed by Alan Turing in 1950. A human interrogator communicates by text with two respondents — one human, one machine. If the interrogator cannot reliably distinguish them, the machine is said to pass. The test deliberately avoids defining "thinking" and focuses on observable text behavior instead.
Where the analogy breaks: In the real Turing Test, the interrogator actively probes with trick questions and sudden topic switches — a casual chat has a much lower bar. The WhatsApp analogy understates the adversarial nature of the test.
Turing's paper: published 1950 in "Mind", Vol. LIX(236)
In 2024, researchers (Jones, Bergen et al.) ran controlled Turing Test experiments with GPT-4. Over 500 participants tried to distinguish AI from humans in 5-minute text conversations. Many failed — but trained interrogators could exploit systematic weaknesses: hallucinations, inability to discuss personal experiences. This shows that "passing" is context-dependent and not proof of understanding.
Common Misconception: Passing the Turing Test = Being Intelligent
The Turing Test measures behavioral imitation, not understanding. A machine can pass by convincingly mimicking human behavior — without "understanding" anything. A famous thought experiment later in this article shows why symbol manipulation does not equal comprehension.
The Chinese Room
In 1980, philosopher John Searle formulated a famous thought experiment: Imagine you sit in a closed room. Through a slot, you receive slips of paper with Chinese characters. You don't speak Chinese, but you have a rule book that gives you the appropriate response for every character combination. You push the answer back through the slot. From outside, it looks like you understand Chinese — but you're merely manipulating symbols according to rules, without grasping any meaning.
Searle's argument: This is exactly what computers do. They manipulate symbols according to rules without "understanding" the meaning. Passing a Turing Test proves behavioral imitation, not understanding. The question is more relevant today than ever: Large language models pass informal Turing Tests but show systematic weaknesses like hallucinations and lack of causal reasoning.
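The rule book in Searle's room can be pictured as a plain lookup table. The following is a minimal sketch, not a model of any real system: the table entries, the `room` function, and the fallback reply are hypothetical and chosen only for illustration.

```python
# Hypothetical rule book: incoming symbol strings mapped to outgoing ones.
# Nothing in this program represents what any of the symbols mean.
RULE_BOOK = {
    "你好吗": "我很好",
    "你会说中文吗": "会",
}

# Hypothetical default reply for slips the rule book does not cover.
FALLBACK = "请再说一遍"


def room(slip: str) -> str:
    """Return the reply the rule book prescribes for the incoming slip."""
    return RULE_BOOK.get(slip, FALLBACK)


print(room("你好吗"))
```

From outside, the replies look fluent. Inside, there is only string matching, which is the gap between behavior and understanding that Searle's argument points at.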
Turing's Nine Objections
Remarkably, Turing anticipated many of the later criticisms of AI back in 1950. In his paper, he addressed nine objections, among them:
The theological objection: Only humans have souls, so machines cannot think. Turing: Why should God not be able to give a machine a soul?
The mathematical objection (Gödel's incompleteness theorem): Formal systems have limits. Turing: Humans also make logical errors.
The consciousness argument: Machines feel nothing. Turing: We cannot prove that other humans "really" feel either.
Lady Lovelace's objection (1843): Machines can only do what they are explicitly told. Turing: This may be true for simple machines — but could a learning system not exhibit surprising behaviors?
These objections are not historical curiosities — they are at the heart of modern AI debates. The question of whether large language models "understand" or merely imitate is essentially Turing's original question in new form.
The Name — Dartmouth 1956
Before Dartmouth, researchers in logic, automata theory, and cybernetics (the science of control and communication) worked on related problems — but under different names and without a shared identity. McCulloch and Pitts had developed a formal neuron model in 1943. Norbert Wiener coined "cybernetics" in 1948. Turing published his test in 1950. The ideas existed — they lacked a name.
John McCarthy deliberately chose the term "Artificial Intelligence" to set the new field apart from cybernetics. In August 1955, McCarthy, Minsky, Rochester, and Shannon drafted a proposal for a summer workshop at Dartmouth College.
Dartmouth Conference (1956): the summer workshop that established AI as a discipline
Participants of the Dartmouth Conference:
John McCarthy: Organizer, Dartmouth/Stanford. Coined "AI", created LISP
Marvin Minsky: MIT. Pioneer of symbolic AI and neural networks
Allen Newell: Carnegie Mellon. Co-creator of Logic Theorist and GPS
Herbert A. Simon: Carnegie Mellon. Logic Theorist, Nobel Prize in Economics
Claude Shannon: Bell Labs. Founder of information theory
Nathaniel Rochester: IBM. Chief architect of IBM 701, simulated neural networks
Their bold conjecture: "Every aspect of learning or any other feature of intelligence can in principle be so precisely described that a machine can be made to simulate it."
The conference itself produced no technical breakthrough — no groundbreaking paper, no invention. What Dartmouth delivered was something else: a name, a community, and a shared research agenda. Dartmouth was a social milestone, not a technical one.
The Dartmouth participants were remarkably optimistic. They believed machines could achieve human-level intelligence within a generation. This overconfidence would haunt the field for decades — and lead directly to the AI Winters.
The First Programs — Brilliant and Brittle
The first generation of AI programs (1955-1970) shared a common approach: handcoded symbolic manipulation. Rules, patterns, and formal logic — with no learning from data whatsoever. They worked perfectly within their narrow, programmer-defined worlds and failed completely outside them.
The Logic Theorist (1955-56) by Newell, Simon, and Shaw proved 38 of the first 52 theorems from Chapter 2 of Principia Mathematica (a foundational three-volume work on mathematical logic by Whitehead and Russell, published 1910-1913). For Theorem 2.85, it even found a shorter proof than the original. Its method was heuristic search in a state space, essentially a map of all possible solution paths. This is exactly the principle formalized in the next article (Graph Search).
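Heuristic search in a state space can be sketched generically. The example below is not the Logic Theorist's actual procedure. It is a greedy best-first search over a hypothetical toy domain (reach a target number from 1 using "+1" and "×2"), and the names `best_first_search`, `successors`, and `heuristic` are assumptions made for illustration.

```python
import heapq


def best_first_search(start, is_goal, successors, heuristic):
    """Greedy best-first search: always expand the state that the
    heuristic ranks as closest to the goal. Returns a path or None."""
    # Each entry: (heuristic score, tie-breaker, state, path to state).
    frontier = [(heuristic(start), 0, start, [start])]
    visited = {start}
    counter = 1  # tie-breaker so the heap never has to compare paths
    while frontier:
        _, _, state, path = heapq.heappop(frontier)
        if is_goal(state):
            return path
        for nxt in successors(state):
            if nxt not in visited:
                visited.add(nxt)
                heapq.heappush(
                    frontier, (heuristic(nxt), counter, nxt, path + [nxt])
                )
                counter += 1
    return None  # state space exhausted without reaching the goal


# Hypothetical toy domain: states are integers, moves are +1 and *2.
TARGET = 10


def successors(n):
    # Bound the space so the search always terminates.
    return [m for m in (n + 1, n * 2) if m <= 2 * TARGET]


def heuristic(n):
    # Distance to the target number; smaller means "looks closer".
    return abs(TARGET - n)


path = best_first_search(1, lambda n: n == TARGET, successors, heuristic)
print(path)
```

The heuristic only ranks which state to try next. It does not guarantee the shortest path, which is one reason later work distinguishes between greedy search and algorithms with optimality guarantees.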
ELIZA (1966) by Joseph Weizenbaum simulated a psychotherapist — using nothing more than keyword detection and text templates. No grammar analysis, no world knowledge, no memory between sentences.
Select template: "Tell me more about your family."
Output: sounds empathetic, but understands nothing
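The keyword-and-template mechanism can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not Weizenbaum's original script: the rules in `RULES`, the `respond` function, and the fallback reply are all hypothetical.

```python
import re

# Hypothetical keyword rules: (pattern to detect, response template).
# Rules are tried in order; the first match wins.
RULES = [
    (re.compile(r"\b(mother|father|sister|brother|family)\b", re.IGNORECASE),
     "Tell me more about your family."),
    (re.compile(r"\bi feel (.+)", re.IGNORECASE),
     "Why do you feel {0}?"),
    (re.compile(r"\bi am (.+)", re.IGNORECASE),
     "How long have you been {0}?"),
]

# Hypothetical reply used when no keyword is detected.
FALLBACK = "Please go on."


def respond(user_input: str) -> str:
    """Return the filled-in template of the first rule whose pattern matches."""
    for pattern, template in RULES:
        match = pattern.search(user_input)
        if match:
            # Reuse the user's own words, minus trailing punctuation.
            captured = [group.rstrip(".!?") for group in match.groups()]
            return template.format(*captured)
    return FALLBACK


print(respond("My mother never listens to me."))
print(respond("I feel sad today."))
```

The function keeps no state between calls and parses no grammar. It only echoes fragments of the input back inside fixed templates, which is why the replies can sound attentive without any comprehension behind them.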
The ELIZA Effect: Weizenbaum's own secretary asked him to leave the room so she could talk "privately" with ELIZA. Humans project understanding onto systems that merely process surface patterns — because we tend to interpret coherent responses as evidence of comprehension.
SHRDLU (1968-70) by Terry Winograd understood natural language, but only within a world of geometric blocks. "Pick up the red block and put it on the blue one" worked perfectly. But "Why is the red block on the blue one?" was impossible. SHRDLU is the archetypal example of the "toy world" problem: perfection in a simplified world that collapses in the real one.
Looking back at this first generation of AI programs, a clear pattern emerges:
What they could do
Impressive behavior in narrow domains: mathematical proofs, convincing conversation, natural language understanding.
Where they failed
Immediate failure outside the programmed world. Cause: handcoded rules only cover what the programmer anticipated.
The shared architecture of all three programs — handcoded symbols and rules — is not an individual weakness but the fundamental limit of the symbolic paradigm. This brittleness is why the field eventually sought alternative approaches: learning from data instead of manual programming.
Common Misconception: Early AI Was Primitive and Unimportant
The first AI programs were not "primitive"; they were brilliant within their limits. For at least one theorem, Logic Theorist found a shorter proof than the one printed in Principia Mathematica. ELIZA demonstrated the power of surface patterns. SHRDLU showed that natural language processing is possible in principle. Their concepts (heuristic search, pattern matching, state spaces) remain fundamental to this day.
Milestones of Early AI
Each breakthrough builds on previous work: from Turing's philosophical question through the naming of the field to the first programs that were both brilliant and brittle.
1950: Turing's Paper
Alan Turing publishes "Computing Machinery and Intelligence" in the journal Mind. Instead of asking the unanswerable question "Can machines think?", he proposes an observable test: the Imitation Game.
Significance: The Turing Test avoids philosophical dead ends and makes intelligence measurable — an idea that shapes AI debates to this day.
Summary
AI was not invented but declared — Dartmouth 1956 gave a name to a convergence of existing ideas from logic, computation, and psychology.
The Turing Test measures behavioral imitation, not understanding — a distinction that matters more today than ever with large language models.
Early AI programs proved that intelligence-like behavior is achievable in narrow domains, but handcoded rules cannot scale — paving the way for search algorithms and machine learning.
Logic Theorist and the General Problem Solver (GPS) treated problem-solving as searching through a space of possible states. The next article formalizes this idea: breadth-first search, depth-first search, and state spaces are the mathematical framework behind what these early programs did intuitively.
Quiz: The Birth of AI
1. What does the Turing Test actually evaluate?
☐ A) Whether a machine has consciousness
☐ B) Whether a machine can fool a human interrogator in text-based communication
☐ C) Whether a machine can solve mathematical problems
☐ D) Whether a machine has emotions
2. A customer service chatbot answers questions so well that 80% of users believe they are talking to a human. A colleague claims: "This proves the chatbot understands customer problems." What is the most accurate response?
☐ A) Correct — if users cannot tell the difference, the system must understand
☐ B) Incorrect — the Turing Test only measures behavioral imitation, not genuine understanding
☐ C) Incorrect — the chatbot would need to pass a formal Turing Test, not just fool customers
☐ D) Correct — but only if the chatbot can also solve novel problems
3. Why is the 1956 Dartmouth Conference considered the "birth" of AI?
☐ A) Because the first AI program was presented there
☐ B) Because participants solved the problem of machine intelligence
☐ C) Because it gave the field a name, a community, and a shared research agenda
☐ D) Because the first neural network was developed there
4. Logic Theorist, ELIZA, and SHRDLU were all successful in their narrow domains but failed in the real world. What best explains the shared root cause?
☐ A) The computers of the 1960s were too slow for complex AI
☐ B) The programmers did not have enough data to train the systems
☐ C) All three relied on handcoded rules that could only cover what the programmer explicitly anticipated
☐ D) The Turing Test set too high a bar for these early programs
5. A friend builds a chatbot that uses keyword matching (like ELIZA) to respond to user inputs. Users report feeling "understood." What phenomenon does this illustrate?
☐ A) The Dartmouth Effect — users overestimate AI because of its prestigious academic origins
☐ B) The ELIZA Effect — humans project understanding onto systems that merely process surface patterns, because we tend to interpret coherent responses as evidence of comprehension
☐ C) The Chinese Room — the chatbot actually understands the keywords it detects
☐ D) The Turing Effect — any system that produces natural language is genuinely intelligent
Answer Key: 1) B · 2) B · 3) C · 4) C · 5) B
Knowledge Check
What does the Turing Test actually measure and why is it not proof of genuine understanding?
Why was the Dartmouth Conference a social milestone rather than a technical breakthrough?
What shared weakness did the first AI programs have and why did it cause them to fail?