Appendix B — Disclosure: Use of AI in Authoring
Purpose
A handbook that teaches students how to use AI tools responsibly would be hypocritical if it quietly pretended its own authors did not use them. This appendix is a transparent record of how large language models were used in drafting, editing, and maintaining the INFO Missing Manual, plus a discussion of the editorial choices we made about what the AI was and was not allowed to do.
We are writing this in the voice of the human authors, not the model — Brian C. Keegan and Abram Handler. If you are using this book in a classroom and want a concrete example of what “AI disclosure” looks like for a piece of academic work, the first half of this appendix is it. The second half discusses the harder editorial and pedagogical questions the tools forced us to confront.
B.1 What AI tools were used
During the drafting and conversion of this handbook we used:
- Claude (Anthropic), accessed through Claude Code, to draft prose, convert LaTeX source files into Quarto, and assist with editing passes. Claude wrote first drafts of several chapters: Chapter 5, Chapter 7, Chapter 15, Chapter 18, Chapter 19, Chapter 20, Chapter 21, Chapter 22, Chapter 23, Chapter 24, Chapter 34, and this appendix. Claude also drafted the chapter-to-meme template assignments now stored in each chapter's `meme:` frontmatter (curated and reviewed by the human authors); see Chapter 35 for related context.
- ChatGPT (OpenAI), occasionally, as a second opinion during outlining.
- GitHub Copilot, inside the editor, for small-scale code completion in worked examples.
- A few earlier chapters were drafted by the human authors before AI assistance was introduced; those chapters were later edited with AI assistance but are substantively human-written.
The model weights, providers, and versions were chosen based on access and cost, not endorsement. A handbook written today would include somewhat different tools than one written a year ago; we expect the list above to keep shifting.
B.2 What the AI was asked to do
We used the models for roughly five kinds of work, in decreasing order of editorial latitude:
- Mechanical conversion from LaTeX to Quarto. Chapters originally written in LaTeX were converted to `.qmd` with a pandoc + Python cleanup pipeline (see the commit history and the `/tmp/convert_chapter.py` script discussed in `CLAUDE.md`). The AI wrote the cleanup script after being shown sample input and desired output. This was the safest and most deterministic use of the tools.
- Drafting gap chapters. For a handful of chapters the human authors had already scoped but not written, we asked Claude to draft the prose from a detailed outline. We specified the canonical 8-section structure (see Chapter 5), the target audience, the tone ("friendly guide, second-person, empathetic"), the length, and the substantive points each section had to make. The model produced a first draft; the human authors then read, edited, reorganized, and verified the technical content.
- Editing and consistency passes. On existing chapters we asked the models to suggest reorganizations, tighten phrasing, catch inconsistencies in terminology, and flag places where cross-references were stale. These were suggestions, not changes.
- Code example review. Worked examples in Python, SQL, and shell were run or mentally executed by the human authors regardless of where they came from. When the AI wrote code, the human authors verified behavior and edited for clarity.
- Brainstorming and outlining. Before any prose was written for a gap chapter, we sometimes used Claude or ChatGPT as a brainstorming partner: “What are the seven things a novice data scientist gets wrong about CSVs?” The outputs informed our outlines but never replaced editorial judgment about what belonged in the book.
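To make the first item above concrete, a conversion pipeline of that shape can be sketched as follows. This is an illustrative sketch, not the actual `/tmp/convert_chapter.py`: the pandoc flags and the cleanup rules shown here are assumptions chosen for the example.

```python
import re
import subprocess
from pathlib import Path

def clean_qmd(text: str) -> str:
    """Small textual fixes applied after pandoc (rules are illustrative only)."""
    # Unescape underscores that pandoc escapes in prose (my\_file -> my_file)
    text = text.replace(r"\_", "_")
    # Collapse runs of three or more newlines down to a single blank line
    return re.sub(r"\n{3,}", "\n\n", text)

def convert_chapter(tex_path: str) -> Path:
    """Convert one LaTeX chapter to a Quarto .qmd file, then clean it up."""
    qmd_path = Path(tex_path).with_suffix(".qmd")
    # Let pandoc do the heavy lifting of the LaTeX -> markdown conversion
    subprocess.run(
        ["pandoc", tex_path, "-f", "latex", "-t", "markdown", "-o", str(qmd_path)],
        check=True,
    )
    qmd_path.write_text(clean_qmd(qmd_path.read_text()))
    return qmd_path
```

In practice the cleanup rules in a script like this are discovered by diffing pandoc's output against a hand-converted sample chapter; treat both functions as a shape, not a specification.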
B.3 What the AI was not asked to do
Some decisions we explicitly kept out of AI hands:
- The book’s overall scope and table of contents. The human authors decided what parts the book would have and what the canonical chapter structure would be. Claude proposed candidate gap chapters during a plan-mode session, but the final choice of which to write, which to defer, and how to order them was ours.
- Claims about specific authors, papers, or attributions. Bibliography entries were curated and verified by the human authors. When the models invented a plausible-sounding citation, we removed it. See Section B.5 on what we got wrong.
- Factual claims about specific tools’ behavior. We cross-checked any claim of the form “X does Y” against the tool’s documentation before shipping it. This was particularly important for pandas method signatures, pytest features, SQL keyword support, and HTTP status codes, where the model’s training data may reflect an older version of reality.
- Representation of our own experiences as teachers. The anecdotes and observations about “students in our courses” are from the human authors’ actual experience. When the model drafted a chapter that included a “students often…” claim, we either verified the claim against our experience or cut it.
- Judgments about pedagogical tradeoffs. "Should we teach `conda` or `venv` first?" is a judgment call that requires knowing our students. We made those calls ourselves.
B.4 Why we think this is OK
The case for using AI tools in authoring a textbook — especially a textbook about computing practices — comes down to three things:
- Speed matters in a fast-moving field. Computing tooling changes every year. The marginal cost of a well-placed chapter on virtual environments or pre-commit hooks is lower with AI assistance, which means more chapters get written, fewer gaps linger in the book, and students get better coverage of topics that would otherwise be deferred forever.
- We retain editorial responsibility. A draft is not a decision. Every paragraph in this book was read by a human author with discretion to rewrite, reorganize, or delete. The AI is a fast junior writer, not a co-author; the buck stops with us.
- Transparency is the right norm. Many textbooks use AI tools and do not say so. We would rather disclose than let students guess. If our disclosure turns out to be more generous than our peers’, so much the better — it gives readers a clear basis to evaluate our choices.
B.5 What we got wrong, and what we watched for
Honesty requires admitting the failure modes we hit, in rough order of how often we hit them:
- Plausible-sounding but incorrect API details. The models would confidently describe a parameter to `pd.read_csv` or a pytest feature that did not exist in the version we targeted, or that had been renamed. Every code example in the book was checked against documentation or executed.
- Fabricated citations. Once or twice, we asked for "a paper on X" and received a plausibly-formatted reference that did not exist. Every bibliography entry in `references.bib` corresponds to a real source that one of the human authors verified.
- Over-confident generalizations. Draft prose sometimes made sweeping claims like "most data scientists do X" or "this is the standard approach," where "most" was the model's guess rather than a substantiated fact. We softened or removed such claims.
- Lost voice and tone drift. A chapter drafted in a single pass would sometimes slip from our “friendly guide” tone into something more formal or generic. Editing passes by the human authors restored the voice.
- Uneven depth. The model sometimes gave equal weight to a trivial topic and a critical one. Re-outlining and shortening or expanding sections was a common human edit.
These are not disqualifying problems — they are workflow problems that we solved with editing. But they are worth naming so that a reader can calibrate the risk.
B.6 What this means for students using the book
If you are a student reading this book to learn computing, three things are worth knowing:
- The code examples have been checked by humans and either executed or traced mentally. If you find one that does not work, please open an issue; see `CLAUDE.md` for how.
- The technical claims have been checked against documentation. We expect occasional errors to remain (nothing this long is perfect) and we will fix them as they are reported.
- The pedagogical judgments — what to teach, in what order, with what emphasis — are the human authors’ choices, informed by teaching students like you.
You should apply the same skeptical habits we describe in Chapter 35 and Chapter 38 when you read this or any other text, regardless of how it was authored.
B.7 Discussion: why we disclose at all
“Does it matter?” is a reasonable question. A textbook is supposed to be correct and clear; what does the provenance of individual sentences have to do with that?
We think it matters for three reasons, which correspond to three different audiences.
For students. One of the skills this book is trying to build is a literate relationship with AI tools: when to use them, how to verify their output, how to read around their mistakes, and how to disclose them when they are part of your own work (see Chapter 35). If we failed to model that ourselves, the advice in those chapters would be hypocritical. Disclosure is the worked example for the standard we are asking students to adopt.
For instructors. An instructor adopting this book for a course has a right to know what kind of artifact they are teaching from. Some instructors will be more comfortable with this than others. We want to make it possible for them to make that choice knowingly, rather than quietly.
For the broader academic community. Norms around AI authorship are still forming. Journals, publishers, and universities are all working out their own disclosure requirements, and the resulting guidelines are inconsistent and sometimes contradictory. By writing down what we did and why, we contribute one data point to that ongoing conversation. We do not expect our practices to become the standard, but we would rather be on the record with a defensible position than hope nobody asks.
B.8 A note on the code we ship
A separate but related question is how AI tools were used in any software we publish alongside the book — currently the cleanup script referenced in `CLAUDE.md` and the `_quarto.yml` / CI workflow configuration. Those were drafted with AI assistance and then tested: the conversion script was run end-to-end on every chapter, and the render pipeline was validated by a clean HTML build with zero warnings. The same "draft, verify, own the result" discipline applies.
If this book grows to include executable code cells (see Chapter 16) or tests that ship with the book itself, the same standard will apply to them: AI may draft, humans verify and take responsibility.
B.9 How we will keep this appendix current
AI tools change. The list of tools we used, the ways we used them, and the kinds of mistakes they made are all moving targets. When we revise the book — adding a chapter, updating an example, responding to reader feedback — we will also update this appendix to reflect the new state of practice.
If a reader notices a substantive gap between what this appendix says and what the book obviously contains, that is a bug and we would like to hear about it.
B.10 Further reading
Students interested in the broader conversation about AI and scholarly authorship might start with:
- Chapter 35, Chapter 36, Chapter 37, and Chapter 38 of this handbook itself.
- Your own institution’s policies on AI assistance in academic work, which are almost certainly stricter in the student-assignment context than ours are in the textbook-authoring context. Do not assume our disclosure gives you cover for your own assignments.
- The editorial policies of the journals and publishers in your field — many now require authors to disclose AI use, and they disagree about what counts as sufficient disclosure.
- Prior essays and position papers on the topic, which are voluminous and moving quickly. Use your library rather than a search engine for the authoritative ones.
B.11 Checklist for your own work
If you ever need to write a disclosure like this one for your own course work or research, these are the questions we found useful to answer:
- Which models or tools did you use?
- What tasks did you use them for? Be specific — “drafting,” “editing,” “brainstorming,” “code completion,” “translation.”
- What tasks did you explicitly not use them for?
- What verification did you do on the output?
- What mistakes did the tools make, and how did you catch them?
- Who takes responsibility for the final result? (Usually: you.)
A disclosure of that shape answers the reasonable questions a reader might have and signals that you understand the tools well enough to use them responsibly.