
Digital Fluency Platform

Product specification

Objective

Enable low-to-mid digital users to:

Target: move users from PIAAC Level 1 (single-step tasks in a generic interface) to Level 2+ (multi-step problem-solving with inferential reasoning), and have them demonstrate far transfer: success on novel-context tasks that share structure, but not surface form, with the training tasks.

The product surfaces, AI co-pilot behavior, and assessment metrics in this spec are direct consequences of the pedagogical commitments in pedagogy.md. The architectural choices are detailed in technical-approach.md.


Target user

Can:

Cannot reliably:

Mental-model gaps (the deeper constraint)

Per the Urban Institute's 2019 finding [1] and the conceptual-change literature [2], the gating issue for adult digital fluency is not UI familiarity but mental models. Smartphone-fluent users lack:

The curriculum must construct these. Procedural training layered over these gaps produces sandbox-procedural users who fail in the wild. See pedagogy.md §3.


Core product concept

A task-based learning system inside an instrumented simulated desktop, with an embedded AI co-pilot that observes the user's work and intervenes pedagogically.

The simulated environment is a deliberate architectural choice driven by latency, telemetry richness, and reliability constraints — see technical-approach.md §1.


System overview

1. Simulated desktop (in-browser)

A simplified OS-like environment with:

Constraints:

2. Task engine

Each task:

Examples (placeholder content — final task content is shaped by fieldwork.md Phase 1):

3. AI co-pilot

Persistent side panel. Capabilities:

Rules (see pedagogy.md §4 and technical-approach.md §4):

4. Assessment engine

Tracks two distinct categories of metric:

Near-transfer (procedural fluency):

Far-transfer (schema application), the headline metric:

Outputs:

The near/far distinction is non-negotiable. A system that only measures near-transfer will produce sandbox-procedural users — see pedagogy.md §6 (falsification).
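
As a rough illustration of how the engine might keep the two categories separate, the sketch below records each task outcome against its transfer kind and computes the headline far-transfer rate. The names (TaskResult, transferKind) are hypothetical, not the actual assessment schema.

```ts
// Hypothetical sketch: near- and far-transfer results kept as separate
// dimensions, with the headline metric computed over far-transfer tasks only.

type TransferKind = "near" | "far";

interface TaskResult {
  taskId: string;
  pattern: string;          // controlled-vocabulary pattern the task exercises
  transferKind: TransferKind;
  success: boolean;
}

function transferRate(results: TaskResult[], kind: TransferKind): number {
  const relevant = results.filter(r => r.transferKind === kind);
  if (relevant.length === 0) return 0;
  const successes = relevant.filter(r => r.success).length;
  return successes / relevant.length;
}

// Headline metric: success rate on novel-surface-form assessment tasks.
// const farTransferRate = transferRate(allResults, "far");
```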


Transfer Design Principles

The five evidence-supported moves from pedagogy.md, translated into operational product constraints:

TDP-1: Cross-domain task families

Each skill is practiced in at least three different surface forms before being marked "taught." Not three email tasks — one email task, one form task, one document task that share the same underlying pattern. The variation across surfaces produces low-road transfer; the consistency of pattern builds the schema.

Implementation: Curriculum-engine constraint. The task selection algorithm cannot mark a pattern "mastered" on a single surface form, regardless of completion success.
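
A minimal sketch of how that constraint could look in code, assuming a per-user completion log; the type and field names (Completion, surfaceForm, MIN_SURFACE_FORMS) are illustrative, not the real curriculum-engine interface.

```ts
// Illustrative TDP-1 check: a pattern counts as "taught" only after successful
// completions on at least three distinct surface forms.

interface Completion {
  pattern: string;       // underlying pattern (name is illustrative)
  surfaceForm: string;   // e.g. "email", "web-form", "document"
  success: boolean;
}

const MIN_SURFACE_FORMS = 3;

function isPatternTaught(log: Completion[], pattern: string): boolean {
  const forms = new Set(
    log
      .filter(c => c.pattern === pattern && c.success)
      .map(c => c.surfaceForm)
  );
  // Single-surface success, no matter how strong, never clears the bar.
  return forms.size >= MIN_SURFACE_FORMS;
}
```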

TDP-2: Explicit pattern naming

On task subgoal completion, the co-pilot names the pattern the user just used, in the curriculum's controlled vocabulary, with one example of where the pattern recurs.

Implementation: A defined trigger in the co-pilot orchestrator (task-complete trigger). The pattern vocabulary is a small, stable set (~12–20 patterns) defined as part of curriculum design, not invented per-response.
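
One way the task-complete trigger could consume that vocabulary, sketched with hypothetical identifiers (PatternEntry, PATTERN_VOCABULARY). The point the sketch makes: the lookup table is authored once at curriculum-design time, and the co-pilot never invents pattern names at runtime.

```ts
// Sketch of the TDP-2 task-complete trigger against a fixed pattern vocabulary.

interface PatternEntry {
  id: string;            // stable key in the controlled vocabulary (~12–20 entries)
  displayName: string;   // learner-facing pattern name
  recursExample: string; // one canonical "where else this shows up" example
}

const PATTERN_VOCABULARY: Record<string, PatternEntry> = {
  // ...defined during curriculum design, never generated per-response
};

// Fired by the co-pilot orchestrator on subgoal completion.
function patternNamingMessage(patternId: string): string | null {
  const entry = PATTERN_VOCABULARY[patternId];
  if (!entry) return null; // unknown pattern: stay silent rather than invent one
  return `You just used the "${entry.displayName}" pattern. ` +
         `It also shows up when ${entry.recursExample}.`;
}
```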

TDP-3: Metacognitive debrief

At task end, the co-pilot asks one question: "What pattern did you use? Where else might it apply?" The user articulates the schema in their own words; the co-pilot confirms or refines.

Implementation: A defined trigger in the co-pilot orchestrator (task-end trigger). Frequency limit: one debrief per task. Skipped if the user shows signs of metacognitive overload (per McCarthy 2018 — see pedagogy.md §4).
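
A hedged sketch of the gate around that trigger. The overloadSignal input is a placeholder; which affective and behavioral signals feed it is a telemetry-design question for technical-approach.md, not something this spec fixes.

```ts
// Sketch of the TDP-3 task-end debrief gate: at most one debrief per task,
// skipped under signs of metacognitive overload (per McCarthy 2018).

interface TaskEndContext {
  debriefAlreadyGiven: boolean; // frequency limit: one debrief per task
  overloadSignal: boolean;      // placeholder for distress/overload telemetry
}

const DEBRIEF_PROMPT =
  "What pattern did you use? Where else might it apply?";

function maybeDebrief(ctx: TaskEndContext): string | null {
  if (ctx.debriefAlreadyGiven) return null;
  if (ctx.overloadSignal) return null;
  return DEBRIEF_PROMPT;
}
```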

TDP-4: Contrasting cases

Periodically (~every 4–6 tasks), the system inserts a near-neighbor task that looks similar to the previous one but requires a different pattern. The co-pilot prompts: "Is this the same pattern as last time, or different? How can you tell?"

Implementation: Task-selection logic. The contrasting-case trigger fires when the user has just completed a task with primary pattern X and a near-neighbor task with primary pattern Y exists.
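
Sketched below under assumed names (TaskMeta, pickContrastingCase): the selection check looks for a candidate that shares the last task's surface form but requires a different primary pattern, at roughly the 4–6-task cadence described above.

```ts
// Illustrative TDP-4 contrasting-case selection.

interface TaskMeta {
  id: string;
  primaryPattern: string;
  surfaceForm: string;
}

function pickContrastingCase(
  lastTask: TaskMeta,
  tasksSinceLastContrast: number,
  candidates: TaskMeta[]
): TaskMeta | null {
  if (tasksSinceLastContrast < 4) return null; // roughly every 4–6 tasks
  // Near neighbor: looks like the last task (same surface form) but
  // requires a different underlying pattern.
  return (
    candidates.find(
      t =>
        t.surfaceForm === lastTask.surfaceForm &&
        t.primaryPattern !== lastTask.primaryPattern
    ) ?? null
  );
}
```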

TDP-5: Far-transfer assessment

Assessment cycles include tasks the user has never seen, in surface forms they've never trained in, that require previously-taught patterns.

Implementation: The Assessment Engine reserves a held-out pool of "novel surface form" tasks per pattern. These are never used as training tasks for any user.
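
Illustrative only: assuming tasks carry an authoring-time held-out flag, training selection and far-transfer assessment draw from disjoint pools, and far-transfer tasks are further restricted to surface forms the user has never trained on.

```ts
// Sketch of the TDP-5 held-out pool. The `heldOut` flag is an assumed name.

interface AssessmentTask {
  id: string;
  pattern: string;
  surfaceForm: string;
  heldOut: boolean; // true = reserved for assessment, never served as training
}

// Training selection never draws from the held-out pool...
function trainingPool(tasks: AssessmentTask[]): AssessmentTask[] {
  return tasks.filter(t => !t.heldOut);
}

// ...and far-transfer assessment draws only held-out tasks whose surface
// form the user has never trained on.
function farTransferPool(
  tasks: AssessmentTask[],
  trainedSurfaceForms: Set<string>
): AssessmentTask[] {
  return tasks.filter(
    t => t.heldOut && !trainedSurfaceForms.has(t.surfaceForm)
  );
}
```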


Curriculum structure

Five levels. Progression is defined by the kind of mental-model construction each level requires, not by procedural difficulty alone. Constraint per TDP-1: each skill must appear in ≥3 distinct surface forms before being considered taught.

L1 — Operational basics
  Goal: Construct the desktop mental model: file persistence, multi-window state, basic hierarchy.
  Mission alignment: Bridge level. Probably not where our distinctive contribution lives — public libraries already deliver this for free, with in-person support, at scale. v1 likely treats this as a fast-track diagnostic or partners with library Level-1 onboarding.

L2 — Transactional tasks
  Goal: Sequencing and verification. Multi-step processes within one or two apps; recovery from errors.
  Mission alignment: Bridge level. Some library systems handle this; many don't. Our value-add starts here.

L3 — Workflow execution
  Goal: Operate across applications. Compose Level 1–2 patterns into longer workflows that span apps and sustain state.
  Mission alignment: Where our core mission begins. The library system does not scaffold this depth. The Urban Institute (2019) names this as the gap providers can't fill.

L4 — AI-augmented workflows
  Goal: Direct AI as a tool inside a workflow. Evaluate AI output. Refine.
  Mission alignment: Novel. No established curriculum exists at this level for adult digital-fluency populations.

L5 — Adaptation layer
  Goal: Encounter an unfamiliar tool, recognize patterns, figure it out. The closest the platform comes to certifying "I am now digitally fluent."
  Mission alignment: Most aligned with the pitch's thesis. Tools change; AI agents proliferate; the only durable skill is the ability to meet new ones.

Full curriculum content — task families, patterns, mental-model tracks, the controlled pattern vocabulary the AI co-pilot uses — lives in curriculum.md. That document also flags the strategic question of where v1 should target on this scale (see Implications below).

Cross-cutting: mental-model construction tracks

Independent of the procedural-skill progression, the curriculum tracks four mental-model constructions:

The Assessment Engine tracks each as a separate mastery dimension.


UX principles


MVP scope

Open question: where does v1 target on the curriculum?

The previous draft of this spec scoped v1 to "Curriculum Levels 1–2 (~10–15 tasks total)." Drafting curriculum.md surfaced a problem with that framing: Level 1–2 work is largely already served by the public library system (free, in-person, with thirty years of operational experience). A v1 that targets the same competencies competes with established infrastructure and betrays the pitch's "transferable schemas in the age of AI" thesis, because the novel content lives at Levels 3+.

Two alternatives we are considering for v1 — to be decided jointly with Phase 2 of fieldwork.md:

  1. Levels 2–3, Level 1 as a fast-track diagnostic. Users with smartphone fluency (most of the target population) skip Level 1 in 10–20 minutes; users without it get routed to library/Northstar acquisition before re-entering. Targets the real gap libraries don't fill.
  2. Levels 3–4 only, library/ABE partnership for Levels 1–2. Sharper. Concedes the basics to established infrastructure and concentrates engineering on the novel-contribution levels. Distribution requires a partner — fieldwork.md is set up to identify one.

v1 features (regardless of curriculum scope):

Not in v1:

Detailed build phases in technical-approach.md §7.


Key metrics

Primary (headline):

Secondary:

Diagnostic (for our team, not the user):


Risks and mitigations

Pedagogical risks

Sandbox-procedural failure — users become competent in our environment but cannot transfer to real apps. → Mitigated by TDP-1 (≥3 surface forms per skill), TDP-5 (far-transfer assessment), and the falsification metrics in pedagogy.md §6.

AI dependency — users learn to ask the co-pilot rather than to think. → Mitigated by attempt-before-assist (productive struggle), graduated escalation in help, and tracking AI-usage patterns as a chat-quality metric, not just volume.

Metacognitive overload — debrief prompts demoralize low-confidence users. → Mitigated by McCarthy-2018-aware design (one debrief per task max, scaffolded modeling first, skip if affective signals indicate distress). See pedagogy.md §5.

Engagement risks

Novelty effect decay — Bastani's mediator wanes after the first sessions. → Mitigated by the engagement-architecture decision (deferred, see open questions). Field research will probably surface a cohort/community-layer requirement.

Distress vs. productive struggle confusion — the co-pilot intervenes too late or too early. → Mitigated by field research with experienced instructors (fieldwork.md Q2). The v1 trigger thresholds are educated guesses; calibration comes from observation.

Distribution and access risks

The PIAAC Level-1 population is the least likely to find a D2C learning product. → Distribution channel is unspecified. See pitch-and-overview.md "Distribution channel" TODO.

Third-level digital divide — users without home internet or a device cannot practice between sessions; OECD data shows skills decline without sustained practice. → Out of scope for v1 product; partially addressed by partner-distribution choice (libraries provide access).

Competitive risks

Khan / Google / Microsoft can ship a competitor in a quarter if this works. → Mitigated by (a) the population specificity (incumbents have not targeted PIAAC Level-1 adults), (b) the pedagogy specificity (transfer-targeted design is a non-trivial intellectual commitment), (c) the open-source platform-layer strategy (long-term defensibility comes from being the standard, not from a moat).


Technical notes

Architecture, telemetry layer, AI integration, model choices, state estimation, cost model, and build phases: technical-approach.md.

Headline points:


Distribution and pilot plan

TODO (Matt): Distribution channel commitment + first-cohort plan.

The field-research program (fieldwork.md) is designed to produce 2–3 named candidate partner organizations as a deliverable. Defer commitment until Phase 2 of that program completes.

Likely shape (subject to revision):

  • Partner with one library system or ABE provider for the first 100–500-user pilot.
  • Co-design content for a cohort with the partner organization (their staff knows their learners; co-design has compounding pedagogical value).
  • Partner organization runs in-person component (per Urban Institute: "humanizing" matters); product runs the AI-coached interactive component.
  • Pilot success criterion: defined jointly with partner against far-transfer rate and engagement-retention thresholds.

Long-term expansion


End state

A system that:

Footnotes

  1. Hecker & Loprest, Foundational Digital Skills for Career Progress, Urban Institute 2019. Notes: research/grey-literature/urban-institute-2019-foundational-digital-skills.md.

  2. Synthesis of conceptual-change literature (Chi 2008 framework, NN/G mental-model studies). See research/summaries/adult-ct-and-digital-skills-transfer.md and pedagogy.md §3.