The Mixed Ledger
Hoppy steps through the first gate of the archivist’s gauntlet and finds two things spread across the desk: one ledger page cluttered with noisy marks, and one shelf index stored as a neat record file. Either piece alone is manageable, but once they need to work together, isolated tricks are no longer enough.
That is the opening feeling of Chapter 6: not learning one more new trick, but bringing your earlier string, text, record, and small-structure skills to bear on a single task.
Combined tasks often mean “clean text first, then enrich it with records”
In a slightly more realistic small data task, one kind of material is rarely enough. You may first pull key fields out of one messy text line, then use a structured record to add more information, and finally gather everything into a clearer result.
sample_line = " ## token=glow~dust | shelf=s1 | status=ready?? "
sample_index = {
    "s1": {"keeper_name": "Pip", "room_name": "Moss Hall"}
}
# Clean: trim outer spaces, drop the "## " prefix, turn "~" into a space, drop "??".
cleaned_line = sample_line.strip().replace("## ", "").replace("~", " ").replace("??", "")
# Split: break the row into "key=value" pieces, then take the value after "=".
parts = cleaned_line.split(" | ")
token_name = parts[0].split("=")[1]
shelf_code = parts[1].split("=")[1]
# Enrich: look up the shelf record to find the keeper.
keeper_name = sample_index[shelf_code]["keeper_name"]
preview = {
    "token_name": token_name,
    "keeper_name": keeper_name
}
print(preview)
This example only shows the action chain of “clean one text row, then use one record file to enrich it.” It is not the full answer to today’s starter. In the real task, you will organize all three ledger rows and gather them into a more structured archive result.
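If the one-line cleanup chain feels too dense to read, it can help to watch the row change one step at a time. The sketch below reuses the same sample row; the names step_1 to step_4 are only illustrative and are not part of the starter.

```python
# Same sample row as in the example above.
sample_line = " ## token=glow~dust | shelf=s1 | status=ready?? "

step_1 = sample_line.strip()        # trim the outer spaces
step_2 = step_1.replace("## ", "")  # drop the "## " prefix
step_3 = step_2.replace("~", " ")   # turn the cramped "~" into a space
step_4 = step_3.replace("??", "")   # drop the trailing "??"

# repr() (via !r) keeps the quotes visible, so stray spaces are easy to spot.
for label, value in [("strip", step_1), ("prefix", step_2),
                     ("tilde", step_3), ("marks", step_4)]:
    print(f"{label:>6}: {value!r}")
```

Chaining the four calls on one line, as the example does, produces the same final string as step_4.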
Today’s task: turn a mixed ledger into one archive result
The starter already reads mixed_ledger.txt and shelf_index.json for you. Your job is to finish two small tools and then connect them: complete clean_line(raw_line), complete build_entry(cleaned_line), and then build cleaned_lines, organized_entries, ready_entries, and ledger_summary.
All three ledger rows carry the same noise: the "## " prefix, cramped "~" marks, and a trailing "??". Put that cleanup sequence into clean_line(raw_line) first.
Inside build_entry(cleaned_line), first use split(" | ") to break the row into three pieces, then use split("=") to read item_name, shelf_code, and status.
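The two-level split can be sketched on its own before it goes into build_entry. The helper parse_fields below is hypothetical, not part of the starter, and it collects values by key instead of by position. The keys token, shelf, and status come from the sample row earlier in this lesson; the keys in the real ledger file may differ.

```python
def parse_fields(cleaned_line):
    """Hypothetical helper: turn 'key=value | key=value' into a dict."""
    fields = {}
    for piece in cleaned_line.split(" | "):
        key, value = piece.split("=", 1)  # split only on the first "="
        fields[key] = value
    return fields


fields = parse_fields("token=glow dust | shelf=s1 | status=ready")
print(fields)
# {'token': 'glow dust', 'shelf': 's1', 'status': 'ready'}
```

Reading by position, as the task describes, is shorter when every row has the same three pieces in the same order. Reading by key is a little more forgiving if the order ever changes.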
The text row alone only tells you the item, the shelf code, and the status. Use shelf_index[shelf_code] to add keeper_name and room_name, so the result becomes a fuller record.
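Here is a minimal sketch of that enrichment step in isolation. The helper enrich and the unknown shelf code "s9" are assumptions made for illustration; the lesson's own data is expected to contain only known shelf codes, so a direct lookup like shelf_index[shelf_code] is enough for the task itself.

```python
# Same one-shelf index as in the earlier example.
sample_index = {
    "s1": {"keeper_name": "Pip", "room_name": "Moss Hall"},
}


def enrich(entry, index):
    """Hypothetical helper: return a copy of entry with keeper and room added."""
    enriched = dict(entry)  # copy, so the original entry is left untouched
    shelf_record = index.get(entry["shelf_code"], {})  # {} if the code is unknown
    enriched["keeper_name"] = shelf_record.get("keeper_name")  # None if missing
    enriched["room_name"] = shelf_record.get("room_name")
    return enriched


known = enrich({"item_name": "glow dust", "shelf_code": "s1", "status": "ready"}, sample_index)
unknown = enrich({"item_name": "glow dust", "shelf_code": "s9", "status": "ready"}, sample_index)
print(known)
print(unknown)
```

Using get() trades a loud KeyError for a quiet None. For a small, trusted file the direct lookup is often the better choice, because a wrong shelf code then shows up immediately.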
Build an organized_entries list from the three cleaned rows, then keep only the entries whose status is "ready" as ready_entries, and finally gather ledger_summary. That is the real feeling this lesson wants to create: text, records, and result organization now work together.
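The filter-and-gather step can be tried on its own with hand-written entries. Everything in the list below is invented for illustration (the real entries come from build_entry), and the summary keys shown are only one plausible shape; the starter may expect others.

```python
# Invented entries, standing in for what build_entry would return.
organized_entries = [
    {"item_name": "glow dust", "status": "ready", "keeper_name": "Pip"},
    {"item_name": "moon ink", "status": "waiting", "keeper_name": "Juno"},
    {"item_name": "fern seal", "status": "ready", "keeper_name": "Pip"},
]

# Keep only the entries whose status is exactly "ready".
ready_entries = [
    entry
    for entry in organized_entries
    if entry["status"] == "ready"
]

# Gather a few facts about the whole ledger into one small record.
ledger_summary = {
    "entry_count": len(organized_entries),
    "ready_count": len(ready_entries),
}
print(ledger_summary)
# {'entry_count': 3, 'ready_count': 2}
```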
This lesson does not introduce a new concept, and it does not throw you into a large project. We are only pulling earlier moves into the same small task so you can clearly feel: I am no longer doing isolated drills; I am using the whole Series to get something done.
Suggested Solution
import json
with open("mixed_ledger.txt", "r", encoding="utf-8") as file:
    ledger_text = file.read().strip()
with open("shelf_index.json", "r", encoding="utf-8") as file:
    shelf_index = json.load(file)
print("Ledger text:")
print(ledger_text)
print("Shelf index:", shelf_index)
ledger_lines = ledger_text.splitlines()
print("Ledger lines:", ledger_lines)
def clean_line(raw_line):
    # Trim outer spaces, drop the "## " prefix, turn "~" into a space, drop "??".
    return raw_line.strip().replace("## ", "").replace("~", " ").replace("??", "")
def build_entry(cleaned_line):
    # Break the row into three "key=value" pieces and keep each value.
    parts = cleaned_line.split(" | ")
    item_name = parts[0].split("=")[1]
    shelf_code = parts[1].split("=")[1]
    status = parts[2].split("=")[1]
    # Use the shelf code to look up the keeper and the room.
    shelf_record = shelf_index[shelf_code]
    keeper_name = shelf_record["keeper_name"]
    room_name = shelf_record["room_name"]
    return {
        "item_name": item_name,
        "shelf_code": shelf_code,
        "status": status,
        "keeper_name": keeper_name,
        "room_name": room_name,
    }
cleaned_lines = [
    clean_line(ledger_lines[0]),
    clean_line(ledger_lines[1]),
    clean_line(ledger_lines[2]),
]
organized_entries = [
    build_entry(cleaned_lines[0]),
    build_entry(cleaned_lines[1]),
    build_entry(cleaned_lines[2]),
]
ready_entries = [
    entry
    for entry in organized_entries
    if entry["status"] == "ready"
]
ledger_summary = {
    "entry_count": len(organized_entries),
    "ready_count": len(ready_entries),
    "first_keeper": organized_entries[0]["keeper_name"],
    "last_room": organized_entries[2]["room_name"],
}
print("Cleaned lines:", cleaned_lines)
print("Organized entries:", organized_entries)
print("Ready entries:", ready_entries)
print("Ledger summary:", ledger_summary)
Advanced Tips
The most important change in this lesson is not that the code is longer. It is that you are now combining several older abilities at once: clean text, split fields, look up records, and organize the result into clear output.
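One optional variation, if you want to stretch a little: the suggested solution handles the three rows by index, but the same two tools can be applied with list comprehensions so the code no longer cares how many rows there are. The sketch below runs without the two files. Only the first row and the "s1" shelf come from this lesson's example; the other rows and the "s2" shelf are invented for illustration.

```python
# Row 1 and shelf "s1" are from the lesson's example; the rest is invented.
ledger_lines = [
    " ## token=glow~dust | shelf=s1 | status=ready?? ",
    " ## token=moon~ink | shelf=s2 | status=waiting?? ",
    " ## token=fern~seal | shelf=s1 | status=ready?? ",
]
shelf_index = {
    "s1": {"keeper_name": "Pip", "room_name": "Moss Hall"},
    "s2": {"keeper_name": "Juno", "room_name": "Reed Loft"},
}


def clean_line(raw_line):
    return raw_line.strip().replace("## ", "").replace("~", " ").replace("??", "")


def build_entry(cleaned_line):
    # Read the three values by position, as in the suggested solution.
    parts = cleaned_line.split(" | ")
    shelf_code = parts[1].split("=")[1]
    shelf_record = shelf_index[shelf_code]
    return {
        "item_name": parts[0].split("=")[1],
        "shelf_code": shelf_code,
        "status": parts[2].split("=")[1],
        "keeper_name": shelf_record["keeper_name"],
        "room_name": shelf_record["room_name"],
    }


# One comprehension per stage: clean every row, then build every entry.
cleaned_lines = [clean_line(line) for line in ledger_lines]
organized_entries = [build_entry(line) for line in cleaned_lines]

for entry in organized_entries:
    print(entry["item_name"], "->", entry["keeper_name"], "in", entry["room_name"])
```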
Combining skills this way is the beginning of the archivist’s gauntlet. The next lesson will keep combining skills too, and the focus will stay on choosing structures well and arranging the steps steadily, not on adding new knowledge.