Getting Started with the Claude API: Your First AI-Powered App
The Claude API gives you programmatic access to Anthropic’s Claude models — the same intelligence behind Claude.ai, exposed as a simple HTTP API with official SDKs for Python and TypeScript. Whether you’re building a chatbot, a document summarizer, a code reviewer, or an AI-powered CLI tool, the setup is the same. This guide gets you from API key to working application.
Setup
Install the Anthropic Python SDK:
$ pip install anthropic
Get your API key from the Anthropic Console and store it as an environment variable:
$ export ANTHROPIC_API_KEY="sk-ant-..."
Never hardcode the key in source files — read it from the environment.
Your First Message
import anthropic
client = anthropic.Anthropic() # reads ANTHROPIC_API_KEY automatically
message = client.messages.create(
model="claude-sonnet-4-6",
max_tokens=1024,
messages=[
{"role": "user", "content": "Explain recursion in one paragraph."}
]
)
print(message.content[0].text)
$ python main.py
Recursion is a programming technique where a function calls itself to solve a
problem by breaking it down into smaller instances of the same problem. Each
recursive call works on a smaller input until it reaches a base case — a
condition that returns a result directly without further recursion — and then
the results cascade back up through all the calls. A classic example is
computing factorials: factorial(5) calls factorial(4), which calls
factorial(3), and so on until factorial(1) returns 1, and the chain resolves
back upward: 1 × 2 × 3 × 4 × 5 = 120.
The Messages API Structure
Every request sends a list of messages with alternating user and assistant roles:
message = client.messages.create(
model="claude-sonnet-4-6",
max_tokens=2048,
messages=[
{"role": "user", "content": "What is the capital of France?"},
{"role": "assistant", "content": "The capital of France is Paris."},
{"role": "user", "content": "What's the population of that city?"},
]
)
The messages list is how you maintain conversation context — include the full history of the conversation in each request. Claude doesn’t maintain state between requests; you do.
The response object:
print(message.model) # claude-sonnet-4-6
print(message.stop_reason) # "end_turn" or "max_tokens"
print(message.usage.input_tokens) # tokens in your messages
print(message.usage.output_tokens) # tokens in the response
print(message.content[0].text) # the response text
System Prompts
A system prompt sets the context, persona, and instructions for Claude’s behavior. It’s the first thing in the request, before the conversation:
message = client.messages.create(
model="claude-sonnet-4-6",
max_tokens=1024,
system="""You are a senior code reviewer specializing in Python.
When reviewing code, you:
- Point out bugs and potential exceptions
- Suggest more idiomatic Python patterns
- Comment on readability and maintainability
- Keep feedback actionable and specific""",
messages=[
{"role": "user", "content": """Please review this function:
def get_user(id):
users = load_all_users()
for u in users:
if u['id'] == id:
return u
"""}
]
)
print(message.content[0].text)
Streaming Responses
For long responses, streaming lets you display text as it’s generated rather than waiting for the full response:
with client.messages.stream(
model="claude-sonnet-4-6",
max_tokens=2048,
messages=[{"role": "user", "content": "Write a short story about a robot."}]
) as stream:
for text in stream.text_stream:
print(text, end="", flush=True)
print() # newline after streaming completes
# Access final message after stream
final = stream.get_final_message()
print(f"\nTokens used: {final.usage.input_tokens + final.usage.output_tokens}")
Building a Simple Chatbot
A complete interactive chatbot that maintains conversation history:
import anthropic
client = anthropic.Anthropic()
conversation_history = []
def chat(user_message: str) -> str:
conversation_history.append({
"role": "user",
"content": user_message
})
response = client.messages.create(
model="claude-sonnet-4-6",
max_tokens=2048,
system="You are a helpful assistant. Be concise and direct.",
messages=conversation_history
)
assistant_message = response.content[0].text
conversation_history.append({
"role": "assistant",
"content": assistant_message
})
return assistant_message
def main():
print("Chat with Claude (type 'quit' to exit)\n")
while True:
user_input = input("You: ").strip()
if user_input.lower() in ("quit", "exit"):
break
if not user_input:
continue
response = chat(user_input)
print(f"Claude: {response}\n")
if __name__ == "__main__":
main()
$ python chatbot.py
Chat with Claude (type 'quit' to exit)
You: What's 15% of 240?
Claude: 15% of 240 is 36.
You: And 20% of that result?
Claude: 20% of 36 is 7.2.
Error Handling
The SDK raises specific exceptions you should handle:
import anthropic
client = anthropic.Anthropic()
try:
message = client.messages.create(
model="claude-sonnet-4-6",
max_tokens=1024,
messages=[{"role": "user", "content": "Hello"}]
)
print(message.content[0].text)
except anthropic.AuthenticationError:
print("Invalid API key — check ANTHROPIC_API_KEY")
except anthropic.RateLimitError as e:
print(f"Rate limit hit — back off and retry: {e}")
except anthropic.APIStatusError as e:
print(f"API error {e.status_code}: {e.message}")
For production, add exponential backoff on rate limit errors:
import time
import anthropic
def create_with_retry(client, max_retries=3, **kwargs):
for attempt in range(max_retries):
try:
return client.messages.create(**kwargs)
except anthropic.RateLimitError:
if attempt == max_retries - 1:
raise
wait = 2 ** attempt # 1s, 2s, 4s
print(f"Rate limited. Waiting {wait}s...")
time.sleep(wait)
Choosing a Model
| Model | Best for |
|---|---|
claude-opus-4-7 |
Complex reasoning, nuanced writing, difficult coding tasks |
claude-sonnet-4-6 |
Balanced performance and cost — the everyday workhorse |
claude-haiku-4-5-20251001 |
Fast, inexpensive — classification, extraction, simple Q&A |
Start with Sonnet. Switch to Opus for tasks where quality matters more than cost; switch to Haiku for high-volume, simpler tasks.
Token Limits and Cost
Each model has a context window — the maximum total tokens (input + output) per request. Pricing is per million tokens, billed separately for input and output.
# Estimate token count before sending (rough: 1 token ≈ 4 characters)
def estimate_tokens(text: str) -> int:
return len(text) // 4
# Or use the SDK's token counting endpoint
count = client.messages.count_tokens(
model="claude-sonnet-4-6",
messages=[{"role": "user", "content": "Your long text here..."}]
)
print(f"Input tokens: {count.input_tokens}")
TypeScript Example
The TypeScript SDK follows the same pattern:
import Anthropic from '@anthropic-ai/sdk';
const client = new Anthropic(); // reads ANTHROPIC_API_KEY
const message = await client.messages.create({
model: 'claude-sonnet-4-6',
max_tokens: 1024,
messages: [{ role: 'user', content: 'Hello, Claude!' }],
});
console.log(message.content[0].type === 'text' ? message.content[0].text : '');
Conclusion
The Claude API is straightforward: create a client, pass a messages array with user/assistant turns, and optionally a system prompt to set behavior. The key things to internalize are: Claude is stateless (you manage conversation history), use streaming for real-time UX, and pick the right model tier for your use case. From this foundation you can build summarizers, code reviewers, document chatbots, and anything else that needs language understanding.