Software Engineer Track
Software Engineer
Ships features, writes quality code, designs systems. Two tracks from L4+: Engineering Management or Individual Contributor (Principal Engineer / Distinguished Engineer).
5 levels · 8 competencies · Total weight 11.5 · Hire bar L3
The 5 SE levels
L1
Junior SE
You ship:
1–2 small tasks per sprint
📚 Reads code, asks senior often
⚡ Ships small PRs with review
🎯 Follows existing patterns
1.0 – 1.8
L2
Middle SE
You ship:
Features end-to-end, solo
🛠 Builds features without supervision
👀 Reviews L1 PRs constructively
🔁 Iterates fast with feedback
1.9 – 2.7
L3
Senior SE
You ship:
A module, end-to-end. Hire bar.
🏗 Owns a module end-to-end
📐 Designs at feature level
🧑‍🏫 Mentors L1–L2 daily
2.8 – 3.6
L4
Tech Lead · 2 tracks
You ship:
System designs · team or IC track
🧭 Architects systems across services
👥 Leads team (Tech Lead) or codes deep (Principal)
🤝 Bridges client + team on hard calls
3.7 – 4.4
L5
SA · capped seat
You ship:
Cross-ODC strategy · 0–2 seats only
🌍 Spans multiple teams, departments
📜 Sets engineering standards
🎯 Strategic decisions, board-level risk
4.5 – 5.0
Your journey: L1 → L2 → L3 → L4 → L5
Behaviors, not tenure. See the detail below.
Competency matrix — at a glance
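| Competency | W | L1 | L2 | L3 | L4 | L5 |
|---|---|---|---|---|---|---|
| Technical Mastery | ×2 | 2 | 3 | 4 | 5 | 5 |
| Code Quality | ×1.5 | 1 | 3 | 3 | 4 | 5 |
| System Design | ×1.5 | 1 | 2 | 3 | 4 | 5 |
| AI Tools & Modern Eng | ×1.5 | 1 | 2 | 3 | 4 | 4 |
| Problem Solving | ×1.5 | 1 | 2 | 3 | 4 | 5 |
| Communication | ×1.5 | 1 | 2 | 3 | 4 | 4 |
| English Proficiency | ×1 | 1 | 2 | 3 | 4 | 4 |
| Mentoring & Tech Leadership | ×1 | 1 | 2 | 3 | 4 | 4 |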
1 Aware · 2 Developing · 3 Proficient (hire bar) · 4 Senior · 5 Expert
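The page does not spell out how the weights and the score bands on the level cards fit together. The sketch below assumes the overall score is a weight-averaged competency score rounded to one decimal and then matched against the bands; that formula is an assumption, not something the source states. The weights, the L3 row, and the bands are copied from the matrix and cards above, while the function names are hypothetical.

```python
# Weights copied from the competency matrix (they sum to 11.5).
WEIGHTS = {
    "Technical Mastery": 2.0,
    "Code Quality": 1.5,
    "System Design": 1.5,
    "AI Tools & Modern Eng": 1.5,
    "Problem Solving": 1.5,
    "Communication": 1.5,
    "English Proficiency": 1.0,
    "Mentoring & Tech Leadership": 1.0,
}

# Score bands copied from the level cards.
BANDS = [
    ("L1", 1.0, 1.8),
    ("L2", 1.9, 2.7),
    ("L3", 2.8, 3.6),
    ("L4", 3.7, 4.4),
    ("L5", 4.5, 5.0),
]


def weighted_score(scores: dict[str, float]) -> float:
    """Weighted average of competency scores, rounded to one decimal (assumed formula)."""
    total_weight = sum(WEIGHTS.values())
    weighted_sum = sum(WEIGHTS[name] * scores[name] for name in WEIGHTS)
    return round(weighted_sum / total_weight, 1)


def band_for(score: float) -> str:
    """Return the level whose band contains the (already rounded) score."""
    for level, low, high in BANDS:
        if low <= score <= high:
            return level
    raise ValueError(f"score {score} is outside every band")


# The L3 column of the matrix, used here as a sample input.
l3_row = {
    "Technical Mastery": 4,
    "Code Quality": 3,
    "System Design": 3,
    "AI Tools & Modern Eng": 3,
    "Problem Solving": 3,
    "Communication": 3,
    "English Proficiency": 3,
    "Mentoring & Tech Leadership": 3,
}

score = weighted_score(l3_row)
print(score, band_for(score))  # 3.2 L3
```

Under this assumption, each column of the matrix lands inside the band shown on the matching level card, which is some evidence that the reading is close to the intended one.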
Competency deep-dive
Technical Mastery
Weight ×2
L1: 1 · L2: 2 · L3 (hire bar): 3 · L4: 4 · L5: 5
L1
"I learn by doing. I ask for help when stuck."
Knows one programming language well enough to follow patterns. Can complete small, well-defined tasks given to them. Needs guidance on anything non-trivial.
Measurable indicators (pQA verification checklist)
| Indicator | Threshold | pQA source |
|---|---|---|
| Submits PRs reviewed before merge | ≥3 PRs/week, 100% reviewed by senior | GitHub PR author + review log |
| Reads code more than writes (first 3 months) | ≥2 code-reading sessions logged/week with mentor | Confluence learning log + mentor 1:1 notes |
| Asks senior for help when stuck | Several questions/day in first 3 months, ≤30 min stuck before asking | Slack mentor thread + 1:1 notes |
| Production bugs from own code | 0 production bugs in first 3 months (mentor catches before merge) | Jira bug origin tagging + Git blame |
| Completes onboarding tech glossary / stack tutorials | 100% within 60 days of join | HR onboarding tracker + Confluence checklist |
Observable behaviors
1. Writes code that follows existing patterns in the codebase.
2. Completes well-scoped tasks (CRUD, small features) given clear instructions.
3. Asks the Tech Lead or L3 mentor when stuck (several times per day in the first 3 months).
4. Reads code more than writes it in the first 3 months.
5. Knows basic data structures and language syntax solidly.
6. Limited debugging: relies on stack traces and Google.
Expected outputs
- Pull requests for small, well-defined tasks (CRUD endpoints, UI components).
- Unit tests for own code (when explicitly required).
- Daily report on work-in-progress.
- Questions and notes documented to the mentor.
What you would see
In code review, the L1 PR shows correct logic but is inconsistent with codebase patterns. The L3 reviewer comments on naming, style, and one missed edge case. The L1 fixes them and asks two follow-up questions.
Common traps
- Does not ask for help and silently produces wrong work for a week.
- Copy-pastes from Stack Overflow or AI without understanding why it works.
L2
"I own my features. I deliver the majority of the team work."
Independently delivers most features in the sprint. Understands the whole codebase at a structural level. Writes code others can read and maintain.
Measurable indicators (pQA verification checklist)
| Indicator | Threshold | pQA source |
|---|---|---|
| Ships features end-to-end solo on primary stack | ≥1 feature/sprint solo (typically 1–2) without TL handholding | Jira feature assignee + PR author |
| Debug time on production issues in own modules | Root cause within hours (≤8h) for routine bugs | Jira bug timestamps (open → root cause comment) |
| Second stack/framework pickup | Productive in 2nd framework within 2-4 weeks | PR contribution log on 2nd-stack repo |
| Reads architecture diagrams + tech docs unaided | ≥90% design docs reviewed without TL walkthrough | GitHub PR review log + Confluence comments |
| AI/tooling productivity uplift on routine work | ≥20% time saved vs baseline (boilerplate, tests) | AI usage log + sprint velocity self-report |
Observable behaviors
1. Independently delivers medium-complexity features (a full screen + backend + tests).
2. Reads and modifies code across the full stack of the project.
3. Debugs production issues using logs, traces, and step-through.
4. Refactors small areas of the code without supervision.
5. Reviews L1 code; gives constructive comments.
6. Comfortable with the team's primary stack; can pick up a second framework in 2–4 weeks.
Expected outputs
- Pull requests for medium-complexity features (most of the sprint stories).
- Unit tests and integration tests for own work.
- Code review comments on L1 PRs (5+ per week).
- Bug fixes traced to root cause, not just symptom.
- Daily and weekly reports.
What you would see
Given a new story, the L2 reads the AC, looks at the existing code, asks the BA one clarifying question, then ships a working PR within the sprint without check-ins.
Common traps
- Comfortable on the rails — struggles when a problem requires deviating from the existing pattern.
- Treats code review as a formality — accepts L1 PRs without challenging.
L3
"I deliver the hard parts. I guide the team. I represent the engineering quality of the team to the client."
Takes on the hardest technical work of the project. Designs solutions at the feature level. Mentors L1 and L2. Communicates directly with the client Tech Lead.
Measurable indicators (pQA verification checklist)
| Indicator | Threshold | pQA source |
|---|---|---|
| Fluent in 2+ languages across paradigms | ≥2 languages fluent with paradigm difference (e.g. OOP + functional) | PR contribution log across language repos |
| CODEOWNERS on at least one subsystem | Listed as CODEOWNERS for ≥1 subsystem, reviews 100% PRs in scope | GitHub CODEOWNERS file + review log |
| Independent RCA on production incidents | Root cause analysis completed <4h solo on own subsystem | Incident postmortem authorship + timestamps |
| Owns module-level technical decisions | ≥1 module-level decision/quarter authored solo (library choice, refactor) | Confluence ADR + PR series |
| Cross-stack debugging without escalation | ≥80% cross-stack bugs resolved without TL/L4 escalation | Jira escalation log + resolver field |
Observable behaviors
1. Delivers the hardest, most complex features (integrations, performance, security-sensitive).
2. Designs solutions at the feature level: picks the right libraries, patterns, data structures.
3. Reviews PRs from L1 and L2; provides design-level feedback, not just style.
4. Pair-programs with L1 and L2 to teach, not just to deliver.
5. Owns at least one non-trivial subsystem end-to-end.
6. Engages directly with the client Tech Lead on technical decisions.
7. Spots architectural inconsistencies and proposes fixes.
Expected outputs
- Pull requests for high-complexity features.
- Technical design notes for medium-large features.
- Code review feedback on L1 and L2 PRs (10+ per week with at least 3 design-level comments).
- Bug investigation write-ups with root cause analysis.
- Sprint and feature-level technical reports.
What you would see
When the team hits a complex production issue, the L3 drops into a debug session, identifies the root cause within hours (not days), and writes up the fix plan that the team executes.
Common traps
- Becomes the bottleneck — every hard problem routes through them, juniors do not grow.
- Refuses to delegate; takes on too much; quality dips from overload.
L4
"I am responsible for the team technical quality and growth — not just my own code."
Like L3 but with broader scope. Sets the technical direction for a sub-team or small ODC. Decides hiring and replacement. Evaluates L1-L3 engineers. Owns backlog control and sprint quality.
Measurable indicators (pQA verification checklist)
| Indicator | Threshold | pQA source |
|---|---|---|
| Languages fluent across paradigms (sets stack choice) | ≥3 languages fluent, sets stack for ≥1 ODC project/year | Confluence ADR + tech stack doc authorship |
| Owns CI quality gates + AI workflow for sub-team | Owns ≥1 CI pipeline + Skills/Constitution doc | GitHub CODEOWNERS + Confluence Skills doc |
| Distributed-system debugging (race, leak, cross-service) | ≥3 cross-service/perf incidents resolved/year as RCA lead | Incident postmortem authorship |
| Audits team codebases for tech debt | ≥2 codebase audits/quarter with documented findings | Confluence audit report |
| Mentors L3s on systems foundations (DB internals, OS, network) | ≥2 L3s with quarterly coaching log | Confluence mentorship log |
Observable behaviors
1. Sets the technical direction (stack choices, patterns, conventions) for the team or ODC.
2. Reviews and evaluates L1, L2, L3 engineers (formal performance assessment).
3. Participates in technical interviews; recommends hire or no-hire with rationale.
4. Recommends replacements when engineers are not performing or are a wrong fit.
5. Owns backlog technical control: splits stories, sizes correctly, flags hidden risk.
6. Owns the sprint quality gate: no PR ships without approval from them or a designated reviewer.
7. Coaches L3 engineers toward becoming Tech Leads themselves.
8. Coordinates with multiple sub-team Tech Leads (cross-team patterns, shared infrastructure).
Expected outputs
- Technical direction documents for the team (stack choices, key decisions).
- Engineering performance assessments for L1, L2, L3 reports.
- Hiring scorecards and recommendations.
- Replacement recommendations with replacement plan.
- Backlog technical reviews — sizing notes, splits, risk callouts.
- Sprint quality reports.
- Monthly engineering reports to ODC Lead.
What you would see
When a team member is underperforming, the L4 has already had structured 1:1s, set a 30-day improvement plan with the engineer, and brings a clear recommendation (improve / PIP / replace) to the ODC Lead without waiting for HR to ask.
Common traps
- Loses individual contributor edge — stops coding entirely and becomes irrelevant on technical debates.
- Overcorrects the other way — keeps coding so much that hiring, evaluation, and coaching get neglected.
L5
"I control the technology and architecture across all projects in the department. I think in systems, not in tickets."
Designs systems used across multiple ODCs. Consults on technology upgrades. Interviews and evaluates senior candidates. Leads presale technical conversations. Reports to the Delivery Manager.
Measurable indicators (pQA verification checklist)
| Indicator | Threshold | pQA source |
|---|---|---|
| Sets D3 tooling / AI philosophy used as Techvify template | ≥1 philosophy doc adopted by ≥2 ODCs | Confluence philosophy doc + adoption record |
| Consults on system upgrades — migrations, modernizations | ≥3 cross-ODC consultations/quarter | Slack/email consultation log + ADR contribution |
| NFR diagnosis at architecture level (perf, scale, reliability) | ≥2 NFR-level diagnoses resolved/year across ODCs | Incident postmortem + architecture review notes |
| Sets depth bar for unit (RFCs, papers, systems thinking) | ≥1 technical reading group / paper presentation/quarter | Confluence reading group log |
| Coaches L4 Tech Leads on systems thinking | ≥2 L4s coached sustained 2+ quarters | Confluence L4 mentorship log |
Observable behaviors
1. Designs systems for one or more ODCs (architecture, integration patterns, NFRs).
2. Consults on partial or full system upgrades (migrations, modernizations).
3. Interviews and evaluates senior engineering candidates (L3+).
4. Owns presale technical conversations: proposes solutions, defends architecture choices to prospects.
5. Files quarterly reports on department technology health.
6. Contributes to department planning direction.
7. Joins specific projects when requested (selective deep-dive).
8. Syncs regularly with ODC Leads and Tech Leads (no formal reports but high coordination).
Expected outputs
- Architecture documents for ODC systems.
- Modernization and upgrade consultation reports.
- Senior interview evaluations.
- Presale technical proposals.
- Quarterly department technology report.
- Planning direction inputs.
What you would see
Called into a presale meeting with a Fortune 500 prospect, the SA listens to the client's pain points for 20 minutes, asks 5 sharp questions, and sketches an architecture on the whiteboard that the client signs off on the same day.
Common traps
- Becomes too theoretical — loses ground truth on what actually ships.
- Imposes architecture choices on teams without buy-in.
Anti-signal & scoring rule · 3-of-5 majority
| Dimension | L1 · Aware | L2 · Familiar | L3 · Proficient | L4 · Advanced | L5 · Expert |
|---|---|---|---|---|---|
| Anti-signal · auto-disqualifies the level (Rejects) | Copy-pastes from Stack Overflow or AI without understanding. | Treats code review as formality. Accepts L1 PRs without challenge. | Bottleneck: every hard problem routes through them. Juniors don't grow. | Stops coding entirely. Loses IC edge, irrelevant in technical debates. | Too theoretical. Imposes architecture without team buy-in. |
Scoring rule: You are at Level N if you hit Level N in at least 3 of 5 dimensions, AND no dimension is below Level (N−1). Sort levels descending — the 3rd value is your overall level. If the lowest drops more than 1 below, demote 1 level.
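One literal reading of the scoring rule can be sketched as follows. It assumes each of the five dimensions has already been assessed at an integer level from 1 to 5, and that "the lowest drops more than 1 below" refers to the gap between the lowest dimension and the third-highest value. The function name and sample inputs are hypothetical.

```python
def overall_level(dimension_levels: list[int]) -> int:
    """Apply the 3-of-5 majority rule to five per-dimension levels."""
    if len(dimension_levels) != 5:
        raise ValueError("the rule is defined for exactly 5 dimensions")

    # Sort descending; the 3rd value is met or exceeded by at least 3 dimensions.
    ranked = sorted(dimension_levels, reverse=True)
    level = ranked[2]

    # If the weakest dimension is more than one level below, demote by one.
    if level - ranked[-1] > 1:
        level -= 1
    return level


print(overall_level([3, 3, 3, 2, 2]))  # 3: majority at L3, lowest is only 1 below
print(overall_level([4, 4, 3, 3, 1]))  # 2: majority at L3, but the lowest is 2 below
```

Taking the third value of the sorted list is the same as taking the median of five, which is why a single very strong or very weak dimension cannot move the result by itself; only the demotion clause lets the weakest dimension pull the level down.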
Code Quality & Engineering Discipline
Weight ×1.5
L1: 1 · L2: 2 · L3 (hire bar): 3 · L4: 4 · L5: 5
L1
"Quality is what passes review."
Writes unit tests when explicitly required. Follows the team style guide. Quality is whatever the reviewer says it is.
Measurable indicators (pQA verification checklist)
| Indicator | Threshold | pQA source |
|---|---|---|
| Writes unit tests when asked, follows team standard | ≥60% coverage on own new code, 100% PRs include tests | Codecov / SonarQube PR report |
| Fixes review comments without re-explaining | ≥80% comments resolved within 2 review rounds | GitHub PR review thread analysis |
| Follows team conventions (reminders are acceptable) | 100% PRs follow lint/format rules (CI green before review request) | CI lint/format check status |
| Defers severity / quality calls to senior | 100% quality-gate decisions reviewed before merge | GitHub review approval log |
| Bugs flagged by reviewer/QA fixed with regression test | ≥80% fixes include regression test | Jira bug fix PR + test diff |
Observable behaviors
1. Writes unit tests for own code when asked.
2. Follows team coding conventions (with reminders).
3. Fixes review comments on second or third try.
4. Does not yet think about edge cases proactively.
Expected outputs
- Unit tests for own features.
- Code that passes review (eventually).
What you would see
PR ships with 60% test coverage on the new code. Reviewer flags 2 missing edge cases — L1 adds them.
Common traps
- Treats tests as a checkbox, not a safety net.
- Believes “it works on my machine” is enough.
L2
"I test as I write. I review my own PR before requesting review."
Writes unit and integration tests as part of the work, not after. Self-reviews every PR before submitting. Defects on their code are rare.
Measurable indicators (pQA verification checklist)
| Indicator | Threshold | pQA source |
|---|---|---|
| Test coverage by default (not bolted on) | ≥80% coverage on own code (unit + integration) | Codecov / SonarQube per-PR report |
| Self-reviews PR before requesting review | 100% PRs have self-review comments before assignment | GitHub PR self-comment log |
| Catches own bugs in self-review (defect escape) | <8% defect escape to QA on own code | Jira bug origin vs QA-found tag |
| Reviews L1 PRs for correctness + style | ≥3 L1 PRs reviewed/week with substantive comments | GitHub review comment count + depth |
| Refactors small areas during own feature work | ≥20% PRs include refactor commit alongside feature | Git commit message tags (refactor:/feat:) |
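The refactor indicator in the last row could be checked roughly like this. The sketch assumes commit messages use `refactor:` / `feat:` style prefixes, optionally with a scope such as `refactor(api):`, and that commit messages have already been grouped per PR. The PR identifiers and messages are made-up sample data, and the function name is hypothetical.

```python
import re

# Matches "refactor:", "refactor(scope):" and "refactor!:" at the start of a message.
REFACTOR_PREFIX = re.compile(r"^refactor(\([^)]*\))?!?:")


def refactor_share(prs: dict[str, list[str]]) -> float:
    """Fraction of PRs that contain at least one refactor-tagged commit."""
    if not prs:
        return 0.0
    with_refactor = sum(
        1
        for messages in prs.values()
        if any(REFACTOR_PREFIX.match(message) for message in messages)
    )
    return with_refactor / len(prs)


# Hypothetical sample: five PRs, one of which carries a refactor commit.
sample_prs = {
    "PR-1": ["feat: add CSV export", "refactor: extract report writer"],
    "PR-2": ["feat: add date filter"],
    "PR-3": ["fix: handle empty result set"],
    "PR-4": ["feat: refactor-friendly naming for exports"],
    "PR-5": ["test: cover export edge cases"],
}

share = refactor_share(sample_prs)
print(share >= 0.20)  # True: 1 of 5 PRs meets the 20% threshold exactly
```

Anchoring the pattern at the start of the message keeps a commit like `feat: refactor-friendly naming` from being counted as a refactor.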
Observable behaviors
1. Writes unit and integration tests as part of the feature, not bolted on after.
2. Self-reviews own PR before requesting reviewers.
3. Achieves 80%+ test coverage on new code by default.
4. Catches their own bugs in self-review (visible from squashed commits).
5. Reviews L1 PRs for both correctness and style.
Expected outputs
- Unit and integration tests with each PR.
- Self-review comments on own PRs (proof of discipline).
- Reviews of L1 PRs (multiple per week).
What you would see
PR has 85% test coverage, the author has self-reviewed and left two comments fixing minor issues before requesting review.
Common traps
- Reviews L1 code superficially (typo-spotting only).
- Skips integration tests when time is tight.
L3
"Quality is engineered into the design, not added at the end."
Drives the team quality bar. Writes test strategies, not just tests. Reviews PRs at design level. Owns defect investigation.
Measurable indicators (pQA verification checklist)
| Indicator | Threshold | pQA source |
|---|---|---|
| Fully autonomous code delivery (no peer support needed) | 0 instances requiring peer code support for delivery in last quarter | Slack help-channel audit + Jira assignee log |
| Test coverage on own code | ≥85% coverage on own code (unit + integration) | Codecov / SonarQube per-PR report |
| PR turnaround time | <24h from PR open to merge on own PRs | GitHub PR metrics dashboard |
| Defect escape + refactor discipline + zero shortcuts | <3% defect escape, ≥35% PRs include refactor, 0 force-merges, 0 quality gate skips | Jira defect escape tag + Git commit audit + CI gate log |
| Reviews peer PRs with quality lens | ≥5 peer PRs reviewed/week with quality-focused comments | GitHub review comment audit |
Observable behaviors
1. Designs the test strategy for a new feature (unit / integration / e2e mix).
2. Reviews PRs at design level: catches over-engineering, missed abstractions, hidden coupling.
3. Investigates production defects to root cause; writes postmortem.
4. Maintains the team quality gates (definition of done, CI requirements).
5. Refactors aggressively when seeing tech debt.
Expected outputs
- Test strategy documents for major features.
- Design-level review feedback on L1 and L2 PRs.
- Postmortem documents for production defects.
- Refactoring PRs that reduce code complexity (visible in metrics).
What you would see
After a production incident, the L3 produces a 2-page postmortem within 48 hours: timeline, root cause, fix, prevention measures, and a follow-up PR adding a test to prevent regression.
Common traps
- Quality religion — refactors at the expense of shipping.
- Reviews become so deep that PRs sit waiting for days.
L4
"The team quality is mine. I set the bar and the standards."
Defines the team engineering standards. Owns the CI/CD pipeline quality gates. Measures and reports defect trends. Coaches L3s on quality leadership.
Measurable indicators (pQA verification checklist)
| Indicator | Threshold | pQA source |
|---|---|---|
| Production incidents from own code in last quarter | 0 production incidents from own code/quarter | Incident log + Git blame on root cause |
| Owns CI / quality bar for sub-team | Owns ≥1 CI pipeline + DoD + review checklist | GitHub CODEOWNERS + Confluence DoD doc |
| Tracks defect escape rate + MTTR for sub-team | Defect escape <3% sub-team, MTTR <4h on owned services | Engineering dashboard / Jira metrics |
| Coaches L3s on design-level reviewing | ≥2 L3s with documented review-coaching sessions/quarter | Confluence coaching log |
| Drives systemic fix initiatives (debt + resilience) | ≥2 systemic initiatives/year (e.g. flaky-test purge, retry standard) | Confluence initiative tracker + PR series |
Observable behaviors
1. Defines engineering standards (code style, review checklist, definition of done, CI gates).
2. Owns the CI/CD pipeline quality (test coverage thresholds, security scans, deployment gates).
3. Tracks and reports defect escape rate, MTTR, deploy frequency.
4. Coaches L3 engineers on quality leadership.
5. Drives quality improvement initiatives (testing automation, observability, SRE practices).
Expected outputs
- Engineering standards documents.
- Quality dashboard (defect escape rate, MTTR, coverage trends).
- Quality improvement RFCs.
- Coaching notes for L3 engineers.
What you would see
Monthly engineering report shows defect escape rate trending down for 3 months because the L4 introduced contract testing for the most-failed integration.
Common traps
- Standards become bureaucracy — team slows down.
- Quality dashboard becomes a vanity exercise without action.
L5
"Quality is a property of the system design, the team capability, and the operational practices combined."
Sets quality standards at the department level. Designs systems with quality built in (testability, observability, NFR baselines).
Measurable indicators (pQA verification checklist)
| Indicator | Threshold | pQA source |
|---|---|---|
| Department-level quality standards owner | Standards adopted by ≥80% engineers in department | Confluence standards doc + adoption survey |
| Reviews L4 architecture proposals for quality + resilience | ≥4 L4 architecture reviews/year as quality reviewer | ADR / architecture review sign-off log |
| Brings SRE + chaos engineering practices to ODC | ≥1 SRE/chaos practice introduced + adopted/year | Confluence practice adoption record |
| Department defect-escape benchmark + leaderboard | Updated monthly, covers all ODC teams | BI / Confluence quality dashboard |
| Drives modernization across department (legacy migration) | ≥1 migration program led/year | Confluence migration plan + completion record |
Observable behaviors
1. Sets department-wide engineering standards.
2. Designs systems with testability, observability, and NFR baselines.
3. Reviews architecture proposals from L4s for quality and resilience.
4. Brings industry quality practices (SRE, chaos engineering, etc.) into the department.
Expected outputs
- Department engineering standards.
- System architecture reviews focused on quality.
- Quality practice proposals.
What you would see
Designs the new payments service with explicit MTTR and availability SLOs baked into the architecture document, with a test strategy aligned to the SLOs.
Common traps
- Quality theater — beautiful standards, no operational reality check.
Anti-signal & scoring rule · 3-of-5 majority
| Dimension | L1 · Aware | L2 · Familiar | L3 · Proficient | L4 · Advanced | L5 · Expert |
|---|---|---|---|---|---|
| Anti-signal · auto-disqualifies the level (Rejects) | Tests as checkbox not safety net · 'works on my machine' mindset. | Superficial L1 reviews (typos only) · skips integration tests when busy. | Quality religion: refactors at expense of shipping · reviews so deep PRs sit for days. | Standards become bureaucracy · quality dashboard becomes vanity exercise. | Quality theater: beautiful standards, no operational reality check. |
Scoring rule: You are at Level N if you hit Level N in at least 3 of 5 dimensions, AND no dimension is below Level (N−1). Sort levels descending — the 3rd value is your overall level. If the lowest drops more than 1 below, demote 1 level.
System Design & Architecture
Weight ×1.5
L1: 1 · L2: 2 · L3 (hire bar): 3 · L4: 4 · L5: 5
L1
"Architecture is what the senior engineers decided."
Reads architecture diagrams. Understands what each component does at a name level. Does not yet design.
Measurable indicators (pQA verification checklist)
| Indicator | Threshold | pQA source |
|---|---|---|
| Reads diagrams + follows existing architecture | 100% PRs respect existing module boundaries (no cross-cuts) | PR review architecture-violation comments (=0) |
| Attends design / architecture reviews as observer | ≥80% attendance | Calendar + meeting attendance log |
| Asks before changing anything beyond own feature | 100% scope expansions raised with TL before PR | Slack TL thread + Jira scope-change log |
| Files learning notes on architecture sessions | ≥1 note/week on design sessions attended | Confluence learning log |
| Reads existing ADRs before touching subsystem | 100% subsystem changes preceded by ADR read (mentor verifies) | Mentor 1:1 sign-off |
Observable behaviors
1. Reads existing architecture diagrams and follows them.
2. Knows what each major component in the system does.
3. Asks before changing anything beyond their feature.
Expected outputs
- Code that fits within the existing architecture.
What you would see
Asked to add a new API endpoint, L1 finds the right service, copies the pattern, and ships.
Common traps
- Treats architecture as someone else's job; does not try to understand it.
L2
"I design within a feature. I follow the bigger pattern."
Designs feature-level architecture (class structure, module boundaries, data flow within a feature). Understands NFR concepts.
Measurable indicators (pQA verification checklist)
| Indicator | Threshold | pQA source |
|---|---|---|
| Designs feature-level class structure / module boundaries | ≥1 half-page design note per >1-week feature | Confluence feature design archive |
| Performance/security/scalability awareness at conceptual level | ≥1 NFR question raised per design session | Design review meeting notes |
| Gets sanity check from L3 on non-trivial design | 100% non-trivial designs have L3 review trail | Slack/Confluence design review thread |
| Reads ADRs for context before contributing | ≥1 ADR comment / question per sprint | Confluence ADR comment log |
| Half-page feature design note before 1-week feature | ≥80% >1-week features have design note | Confluence design note vs Jira estimate |
Observable behaviors
1. Designs class structure and module boundaries for a feature.
2. Discusses feasibility and approach with L3 before writing.
3. Understands NFRs (performance, security, scalability) at a conceptual level.
4. Reads architecture decision records (ADRs) for context.
Expected outputs
- Feature-level design notes (a paragraph or a diagram).
- Approach proposals reviewed by L3 before coding.
What you would see
Before starting a 1-week feature, L2 writes a half-page design note and gets a sanity check from L3.
Common traps
- Over-designs small features.
- Or skips design and refactors twice.
L3
"I design systems at module and service level. I think about coupling, testability, evolution."
Designs modules and small services. Picks the right patterns. Considers evolution and operability.
Measurable indicators (pQA verification checklist)
| Indicator | Threshold | pQA source |
|---|---|---|
| Designs feature-level architecture solo | ≥1 feature design authored solo/quarter (multi-component scope) | Confluence feature design archive |
| Authors ADRs on non-trivial changes | 100% non-trivial changes have ADR authored by self | Confluence ADR author log |
| Diagrams for complex changes | ≥80% complex changes include sequence/component diagram | PR description + Confluence design doc audit |
| Reviews peer designs with NFR lens | ≥3 peer design reviews/quarter with NFR comments | Confluence design review log |
| Proposes refactors at module boundary level | ≥1 module-boundary refactor proposed/quarter with trade-off doc | Confluence refactor proposal archive |
Observable behaviors
1. Designs new modules and small services end-to-end.
2. Picks appropriate patterns (events, queues, sync vs async, stateful vs stateless).
3. Considers evolution: how will this change in 12 months?
4. Writes architecture decision records (ADRs) for non-trivial choices.
5. Reviews L2 designs and gives feedback.
Expected outputs
- Module and service design documents.
- ADRs for non-trivial decisions.
- Reviews of L2 designs.
What you would see
When asked to add a new feature that touches 3 services, L3 produces a sequence diagram + ADR identifying the trade-off between consistency and latency, recommends the option, and the team executes.
Common traps
- Over-applies favorite patterns (everything becomes microservices).
- Does not validate design with the team — gets pushback late.
L4
"I own the technical direction for this ODC or sub-team. Cross-component decisions are mine."
Designs at system level for a sub-team or ODC. Cross-cuts services and components. Drives architecture evolution.
Measurable indicators (pQA verification checklist)
| Indicator | Threshold | pQA source |
|---|---|---|
| Designs system architecture across services for ODC | ≥1 cross-service architecture/quarter, ≥2 services involved | Confluence architecture doc + scope |
| ADRs adopted by other teams | ≥2 ADRs/quarter adopted by ≥2 teams | Confluence ADR + cross-team adoption record |
| Handles multi-region, failover, observability, cost trade-offs | ≥1 NFR-heavy design/year (covers ≥3 NFR dimensions) | Confluence NFR analysis doc |
| Critical NFR misses on own systems | 0 critical NFR misses (perf/sec/availability) in last 2 quarters | Postmortem NFR-cause tagging |
| Reviews L3 design proposals + drives deprecation | ≥4 L3 design reviews/quarter as primary reviewer | ADR / design doc review sign-off |
Observable behaviors
1. Designs system-level architecture (multiple services + data flow + deployment).
2. Makes cross-component decisions (consistency, eventual consistency, transaction boundaries).
3. Drives architecture evolution: what is being deprecated, what is new, and why.
4. Coordinates with other Tech Leads on shared infrastructure.
5. Reviews L3 architecture proposals and challenges constructively.
Expected outputs
- System architecture documents.
- Architecture evolution roadmap.
- Cross-team coordination notes on shared platforms.
What you would see
When the team needs to add multi-region support, L4 produces a 4-page system design covering data replication, failover, observability, and cost — with two options and a recommendation.
Common traps
- Architecture astronaut — designs for problems the team does not have.
- Does not update designs when reality diverges.
L5
"I design systems used across the department. Architecture is a strategic lever, not a technical detail."
Designs at department level. Cross-ODC patterns, shared services, technology choices. Influences department strategy through architecture.
Measurable indicators (pQA verification checklist)
| Indicator | Threshold | pQA source |
|---|---|---|
| Designs reference architectures used across multiple ODCs | ≥1 reference architecture adopted by ≥2 ODCs/year | Confluence reference architecture + adoption record |
| Sets allowed languages / frameworks / cloud services for department | ≥1 technology standards doc maintained, reviewed quarterly | Confluence tech standards doc + review log |
| Defines shared services (auth, observability, data platform) | Owns ≥1 shared platform service used by ≥3 teams | Service ownership registry + dependent-team count |
| Modernization roadmaps authored | ≥1 multi-quarter modernization roadmap/year | Confluence roadmap doc + steering review |
| Consults on architecture for major projects across department | ≥3 architecture consultations/quarter from other ODCs | Slack/email consultation log |
Observable behaviors
1. Designs systems and reference architectures used across ODCs.
2. Defines shared services (auth, observability, data platform) for the department.
3. Sets technology standards (allowed languages, frameworks, cloud services).
4. Consults on architecture decisions for major projects across the department.
5. Drives technology modernization strategy (e.g., move from monolith to services, AI integration).
Expected outputs
- Reference architectures for the department.
- Shared service designs.
- Technology standard documents.
- Modernization roadmaps.
What you would see
Designs the department reference architecture for new AI-integrated systems and gets buy-in from 4 ODC Leads and the Delivery Manager within 6 weeks.
Common traps
- Standards become dogma; ODCs lose flexibility.
- Reference architecture does not match actual system needs.
Anti-signal & scoring rule · 3-of-5 majority
| Dimension | L1 · Aware | L2 · Familiar | L3 · Proficient | L4 · Advanced | L5 · Expert |
|---|---|---|---|---|---|
| Anti-signal · auto-disqualifies the level · Rejects | Treats architecture as someone else's job; does not try to understand it. | Over-designs small features. Or skips design and refactors twice. | Over-applies favorite patterns (everything becomes microservices). Validates design too late. | Architecture astronaut: designs for problems the team does not have. | Standards become dogma; ODCs lose flexibility. |
Scoring rule: You are at Level N if you hit Level N in at least 3 of 5 dimensions, AND no dimension is below Level (N−1). To compute it, sort the 5 dimension levels in descending order; the 3rd value is your overall level. If the lowest dimension is more than 1 level below that value, demote by 1 level.
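The scoring rule above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not an official scoring tool: the function name `overall_level` is hypothetical, and "more than 1 below" is read here as measured against the 3rd-highest value.

```python
def overall_level(dimension_levels):
    """Apply the 3-of-5 majority rule to five per-dimension levels (1-5)."""
    if len(dimension_levels) != 5:
        raise ValueError("expected exactly 5 dimension levels")
    ranked = sorted(dimension_levels, reverse=True)
    level = ranked[2]  # 3rd highest: at least 3 dimensions reach this level
    # Assumption: the demotion check compares the lowest dimension
    # against the 3rd-highest value.
    if ranked[-1] < level - 1:
        level -= 1
    return max(level, 1)  # never report below Level 1


print(overall_level([4, 4, 3, 3, 2]))  # 3
print(overall_level([4, 4, 4, 3, 1]))  # 3 (majority says 4, lowest is 3 below, so demoted)
```

The demotion is applied once only, matching the "demote 1 level" wording.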
AI Tools & Modern Engineering
Weight ×1.5
| L1 | L2 | L3 · hire bar | L4 | L5 |
|---|---|---|---|---|
| 1 | 2 | 3 | 4 | 5 |
L1
"AI writes code for me. I check if it runs."
Uses AI tools extensively to generate code; the bulk of L1 productivity comes from this. Often accepts AI output without understanding why it works.
Measurable indicators (pQA verification checklist)
| Indicator | Threshold | pQA source |
|---|---|---|
| Uses Copilot/Claude Code for majority of code generation | AI used on ≥70% of PRs (boilerplate, syntax, scaffolding) | AI usage log / IDE telemetry |
| Reviews AI output before commit (no blind accept) | 100% AI-generated code has human-edited diff before commit | Git commit diff vs AI suggestion log |
| Logs AI usage notes / prompts to mentor | ≥1 prompt note shared with mentor/week | Confluence prompt log / mentor 1:1 |
| Asks mentor when AI output looks wrong | 100% hallucinations flagged to mentor before merge | Slack mentor thread + AI-flag log |
| Completes AI tools onboarding (prompt basics + review discipline) | 100% within 30 days of join | HR / Confluence AI onboarding checklist |
Observable behaviors
1. Uses GitHub Copilot or Claude Code for the majority of code generation.
2. Accepts AI suggestions and tests whether the code runs.
3. Does not yet have strong AI-review skills.
4. Sometimes commits AI-generated code that is verbose or non-idiomatic.
Expected outputs
- AI-generated code that passes basic tests.
What you would see
A PR contains a 200-line AI-generated function where the team's idiomatic pattern would take 50 lines. The L3 reviewer flags it.
Common traps
- Trusts AI output without critical review.
- Generates code that works but does not match the codebase style.
L2
"AI is a fast junior pair-programmer. I direct it and review its output."
Uses AI for the bulk of routine code. Has strong review skills — catches AI mistakes, hallucinations, and over-engineering before committing.
Measurable indicators (pQA verification checklist)
| Indicator | Threshold | pQA source |
|---|---|---|
| AI used for boilerplate, tests, refactoring with good prompts | ≥40% productivity gain on routine tasks (self-measured) | Sprint velocity + AI usage log |
| Catches AI hallucinations before commit | 100% wrong API signatures / fake imports caught in self-review | PR review log: 0 hallucination merges |
| Switches between AI tools as task demands | ≥2 AI tools used routinely (e.g. Copilot + Claude Code) | AI usage log / IDE telemetry |
| Crafts good prompts (saved + reusable) | ≥3 reusable prompts saved in team library | Confluence / Skills prompt library |
| Knows limits of AI — abstains when appropriate | ≥2 documented 'AI not used here' rationales/quarter | PR description / design note |
Observable behaviors
1. Uses AI for generation but reviews critically before commit.
2. Catches AI hallucinations (functions that do not exist, wrong API signatures).
3. Uses AI for boilerplate, tests, and refactoring, not for design decisions.
4. Crafts good prompts (specifies constraints, context, desired output format).
Expected outputs
- AI-assisted code that matches the codebase style.
- Self-correcting prompts when first output is wrong.
What you would see
L2 asks AI to write a function, scans the output, notices a missed edge case in the AI's logic, asks AI to revise with the edge case in mind, then ships.
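One class of hallucination named at this level, imports of modules that do not exist, can be partly checked mechanically before self-review. The sketch below uses only the Python standard library (`ast` and `importlib.util.find_spec`). The helper names `unresolved_imports` and `_is_importable` are hypothetical, and the check only confirms that a top-level module can be found in the current environment; it says nothing about wrong function signatures.

```python
import ast
import importlib.util


def _is_importable(name):
    """True if a top-level module with this name can be located."""
    try:
        return importlib.util.find_spec(name) is not None
    except (ImportError, ValueError):
        return False


def unresolved_imports(source):
    """Return top-level module names imported by `source` that cannot be found."""
    missing = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            names = [alias.name for alias in node.names]
        elif isinstance(node, ast.ImportFrom) and node.level == 0 and node.module:
            names = [node.module]  # absolute "from x import y" only
        else:
            continue
        for name in names:
            top_level = name.split(".")[0]
            if not _is_importable(top_level):
                missing.add(top_level)
    return sorted(missing)


generated = "import json\nimport totally_made_up_pkg_xyz\nfrom os import path\n"
print(unresolved_imports(generated))
```

A non-empty result is a prompt to look closer, not proof of a hallucination: the package may simply not be installed locally.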
Common traps
- Over-relies on AI for things AI is bad at (architecture, debugging complex issues).
- Spends time fixing AI output that would have been faster to write by hand.
L3
"AI helps me think — I use it to challenge my designs, generate test cases, and review my own code."
Uses AI not just to generate code but to design solutions at the functional level and review the team's code.
Measurable indicators (pQA verification checklist)
| Indicator | Threshold | pQA source |
|---|---|---|
| Fluent AI Agent usage daily | AI Agent (Claude Code / Copilot Agent) used daily on ≥1 task | AI Agent usage log / session telemetry |
| Productivity gain from AI | ≥30% productivity gain on owned work vs non-AI baseline | Sprint velocity comparison + AI usage log |
| Reviews 100% AI-generated code before commit | 100% AI-generated code reviewed + edited by self before commit | Git commit diff vs AI suggestion audit |
| Shares reusable prompts with team | ≥1 prompt contributed to team library/quarter | Confluence / Skills prompt library contributions |
| Zero AI-unreviewed code incidents | 0 production incidents traced to unreviewed AI-generated code | Incident postmortem AI-cause tagging |
Observable behaviors
1. Uses AI to explore design alternatives before coding.
2. Uses AI to generate edge cases and test scenarios.
3. Uses AI to review own code and team PRs.
4. Coaches L1 and L2 on effective AI usage (prompt patterns, review discipline).
5. Knows the limits of AI and when not to use it.
Expected outputs
- AI-assisted design explorations.
- AI-generated test scenarios for own and others' code.
- AI-assisted code reviews with high-quality feedback.
What you would see
Before designing a new module, L3 asks Claude: "here is the requirement; give me 3 architectural options with trade-offs." Picks one, refines with team, then proceeds.
Common traps
- Becomes dependent on AI for thinking; original ideas atrophy.
L4
"AI is part of the team. I control how it is used, how much it costs, and what it produces."
Uses AI across the full pipeline — design, code, review, testing, documentation. Controls AI costs (token usage). Sets team AI standards.
Measurable indicators (pQA verification checklist)
| Indicator | Threshold | pQA source |
|---|---|---|
| Curates team AI workflow (Skills + Constitution) | Owns ≥1 team Skills doc + Constitution, reviewed quarterly | Confluence Skills doc + version history |
| Uses AI across all stages (design, code, review, test, docs) | All 5 stages with AI integration documented | Confluence AI workflow doc |
| Tracks token cost per project + sets team AI standards | ≥1 cost dashboard live, monthly review | AI provider billing dashboard + Confluence standards |
| Reviews team AI-generated PRs for over-reliance | ≥80% AI-heavy PRs reviewed with AI-quality comment | GitHub PR review comments tagged 'ai-review' |
| Evaluates new AI tools + recommends adoption with cost data | ≥1 tool eval/quarter with recommendation memo | Confluence AI tool eval archive |
Observable behaviors
1. Uses AI tools across all stages: design, code, review, test, docs.
2. Tracks and controls AI cost (token usage, subscription cost) per project.
3. Sets team standards for AI usage: when to use it, which prompts work, what never to trust.
4. Reviews team AI-generated PRs for over-reliance or quality issues.
5. Evaluates new AI tools and recommends adoption.
Expected outputs
- Team AI usage standards.
- AI cost reports per project.
- AI tool evaluations and recommendations.
What you would see
Monthly engineering report includes AI cost per developer, AI-assisted PR rate, and an item like “moved test generation from Cursor to local Claude Code, saved $200/month, same quality.”
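The two report figures mentioned above, AI cost per developer and AI-assisted PR rate, reduce to simple aggregation. The sketch below is illustrative only: the `PullRequest` record, its `ai_assisted` flag, the developer names, and the dollar amounts are all made-up assumptions, not data from any real billing dashboard or PR log.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class PullRequest:
    author: str
    ai_assisted: bool  # hypothetical flag, e.g. set from a PR label


def ai_assisted_pr_rate(prs):
    """Share of PRs flagged as AI-assisted, from 0.0 to 1.0."""
    if not prs:
        return 0.0
    return sum(pr.ai_assisted for pr in prs) / len(prs)


def cost_per_developer(charges):
    """Sum (developer, usd) charge pairs into a per-developer total."""
    totals = defaultdict(float)
    for developer, usd in charges:
        totals[developer] += usd
    return dict(totals)


# Made-up sample data for one month.
prs = [
    PullRequest("an", True),
    PullRequest("an", False),
    PullRequest("binh", True),
    PullRequest("binh", True),
]
charges = [("an", 12.5), ("binh", 20.0), ("an", 7.5)]

print(f"AI-assisted PR rate: {ai_assisted_pr_rate(prs):.0%}")  # AI-assisted PR rate: 75%
print(cost_per_developer(charges))  # {'an': 20.0, 'binh': 20.0}
```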
Common traps
- Lets AI cost run unchecked.
- Mandates AI usage to a degree that creates resentment in the team.
L5
"AI is changing how software is built. I lead how the department adapts."
Like L4 but at the department level. Researches new AI capabilities. Guides team adoption. Influences hiring profile and training.
Measurable indicators (pQA verification checklist)
| Indicator | Threshold | pQA source |
|---|---|---|
| Sets D3 AI philosophy used as Techvify template | ≥1 philosophy doc adopted by ≥2 ODCs | Confluence AI philosophy + adoption record |
| Demonstrates new AI capabilities to department monthly | ≥1 demo/month to department audience | Calendar + demo recording archive |
| Defines what 'good AI usage' looks like at each level | ≥1 leveling rubric, adopted ODC-wide | Confluence AI rubric + HR adoption |
| Trains other engineers on advanced AI usage | ≥4 training sessions/year, ≥20 engineers attended | Training log + attendance record |
| Researches emerging AI capabilities + guides department adoption | ≥1 research memo/quarter influencing tooling decision | Confluence research memo + adoption decision |
Observable behaviors
1. Researches new AI tools and capabilities as they emerge.
2. Guides department adoption of new tools.
3. Trains other engineers on advanced AI usage.
4. Influences the hiring profile: which AI skills to look for at each level.
5. Shapes department-level AI cost strategy.
Expected outputs
- Department AI strategy.
- Training materials on new AI tools.
- Hiring profile updates reflecting AI requirements.
What you would see
Hosts a monthly internal session: "what is new in AI engineering this month" — demonstrates a new capability, recommends adoption or pass.
Common traps
- Chases every new AI tool — burns team time on experimentation.
Anti-signal & scoring rule · 3-of-5 majority
| Dimension | L1 · Aware | L2 · Familiar | L3 · Proficient | L4 · Advanced | L5 · Expert |
|---|---|---|---|---|---|
| Anti-signal · auto-disqualifies the level · Rejects | Trusts AI output without critical review. Code works but does not match codebase style. | Over-relies on AI for things AI is bad at (architecture, complex debugging). | Becomes dependent on AI for thinking; original ideas atrophy. | Lets AI cost run unchecked · mandates AI usage to a degree that creates resentment. | Chases every new AI tool; burns team time on experimentation. |
Scoring rule: You are at Level N if you hit Level N in at least 3 of 5 dimensions, AND no dimension is below Level (N−1). To compute it, sort the 5 dimension levels in descending order; the 3rd value is your overall level. If the lowest dimension is more than 1 level below that value, demote by 1 level.
Problem Solving & Complexity
Weight ×1.5
| L1 | L2 | L3 · hire bar | L4 | L5 |
|---|---|---|---|---|
| 1 | 2 | 3 | 4 | 5 |
L1
"There is a process; I follow it."
Handles problems with clear procedures. When asked something off-process, asks for guidance.
Measurable indicators (pQA verification checklist)
| Indicator | Threshold | pQA source |
|---|---|---|
| Asks mentor when stuck rather than stalling | ≤30 min stuck before asking, several times/day in first 3 months | Slack mentor thread + 1:1 notes |
| Follows documented process for known bug classes | 100% known-class bugs follow runbook (mentor verifies) | Mentor PR review sign-off |
| Files post-resolution learning note | ≥1 learning note per non-trivial bug fix | Confluence learning log |
| Researches before asking (docs, AI, codebase) | ≥3 research artifacts shown per asked question (in first 3 months) | Slack mentor thread evidence |
| Problem scope owned | 1 bug at a time within own feature scope | Jira assignee + scope |
Observable behaviors
1. Follows clear processes ("here is how we handle this kind of bug").
2. Asks the L3 mentor when the problem does not fit any process.
3. Has limited problem decomposition; sees the problem as one big thing.
Expected outputs
- Process-following solutions.
What you would see
When the build is broken in CI, L1 looks for similar past failures, finds the doc, follows the steps.
Common traps
- Stalls when no clear process exists; does not try to figure it out.
L2
"There are multiple approaches; I pick the right one."
Recognizes that a problem can be solved several ways and chooses one. Decomposes medium-complex problems.
Measurable indicators (pQA verification checklist)
| Indicator | Threshold | pQA source |
|---|---|---|
| Researches solutions independently (docs, SO, AI) | ≥80% bugs resolved without TL involvement | Jira resolver vs assignee log |
| Recognizes multiple approaches exist | ≥1 PR/sprint with alternative-considered note | PR description / design note |
| Decomposes medium-complex problems into subtasks | ≥1 Jira epic broken into ≥5 subtasks per sprint | Jira epic-to-subtask tree |
| Picks most appropriate approach based on context | ≥80% PRs cite rationale for approach (vs alternatives) | PR description audit |
| Owns single medium-complex feature end-to-end | ≥1 feature/sprint sole-owner | Jira feature assignee + PR author |
Observable behaviors
1. Recognizes that multiple approaches exist for the same problem.
2. Picks the most appropriate approach based on context.
3. Decomposes medium-complex problems into subtasks.
4. Researches solutions independently (docs, blog posts, Stack Overflow, AI).
Expected outputs
- Working solutions with brief rationale for the approach.
What you would see
Given a performance problem, L2 considers caching, indexing, query optimization, then picks one based on the data access pattern and runs an experiment.
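An experiment of the kind described above can be very small. The sketch below assumes caching is the candidate fix and replays an access pattern through `functools.lru_cache` to count how many lookups would still reach the backend. The function name `backend_calls` and both sample patterns are hypothetical; a real experiment would replay keys captured from production traffic.

```python
from functools import lru_cache


def backend_calls(keys, cache_size):
    """Replay an access pattern and count how many lookups reach the backend."""
    calls = 0

    @lru_cache(maxsize=cache_size)
    def fetch(key):
        nonlocal calls
        calls += 1  # stands in for a slow query
        return key

    for key in keys:
        fetch(key)
    return calls


hot_pattern = ["a", "b", "a", "a", "b", "a"]   # few keys, read repeatedly
cold_pattern = ["a", "b", "c", "d", "e", "f"]  # every key read once

print(backend_calls(hot_pattern, cache_size=2))   # 2
print(backend_calls(cold_pattern, cache_size=2))  # 6
```

With the hot pattern the cache removes four of six backend calls; with the cold pattern it removes none, which would point toward indexing or query optimization instead.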
Common traps
- Picks the first plausible solution; does not weigh alternatives.
L3
"Most real problems are ambiguous. I analyze before I act."
Handles ambiguous problems where the path is not clear. Decomposes complex problems into solvable parts. Identifies the actual question to answer.
Measurable indicators (pQA verification checklist)
| Indicator | Threshold | pQA source |
|---|---|---|
| Independent RCA within 4 hours | Root cause analysis completed <4h solo on own subsystem | Incident postmortem timestamps + authorship |
| Drives systemic improvements from incidents | ≥1 systemic improvement shipped/quarter (process, tooling, code) | Confluence improvement archive + PR series |
| Debugs cross-stack issues solo | ≥80% cross-stack bugs resolved solo (FE+BE, app+infra) | Jira resolver log + bug-class tagging |
| Decomposes complex problems into module-level work | ≥1 complex epic broken into module-level plan/quarter | Confluence problem decomposition doc |
| Owns medium-complex problems across own subsystem | ≥2 medium-complex problems owned end-to-end/quarter | Jira / Confluence ownership record |
Observable behaviors
1. Handles ambiguous problems by talking to stakeholders to clarify before acting.
2. Identifies the actual question vs the surface question.
3. Decomposes complex problems systematically.
4. Runs experiments to validate assumptions before committing to a solution.
5. Spots hidden complexity early (e.g., "this looks like 1 sprint but the data migration is 3 sprints").
Expected outputs
- Problem decomposition documents.
- Experiment plans with hypothesis and results.
- Estimates that account for hidden complexity.
What you would see
Given a vague client request like "make the app faster," L3 instruments first to find the actual bottleneck, identifies 3 candidate fixes, presents trade-offs, and the team picks one.
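"Instruments first" can start as simply as timing each stage of a request before touching any code. The sketch below is a minimal illustration using only the standard library; the `timed` helper, the stage names, and the toy `handle_request` workload are all hypothetical stand-ins for a real request path.

```python
import time
from collections import defaultdict
from contextlib import contextmanager

timings = defaultdict(float)  # stage name -> accumulated seconds


@contextmanager
def timed(stage):
    """Add the wall-clock time spent inside the block to timings[stage]."""
    start = time.perf_counter()
    try:
        yield
    finally:
        timings[stage] += time.perf_counter() - start


def handle_request():
    """Toy request path split into three measurable stages."""
    with timed("load"):
        rows = list(range(50_000))
    with timed("transform"):
        rows = sorted(rows, key=lambda r: -r)
    with timed("render"):
        return ",".join(map(str, rows[:10]))


handle_request()
slowest = max(timings, key=timings.get)
print(slowest, f"{timings[slowest]:.4f}s")
```

The slowest stage is where the candidate fixes should be aimed; for deeper call-level detail, the standard `cProfile` module is the usual next step.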
Common traps
- Over-analyzes — paralysis by analysis.
L4
"I orchestrate problem-solving across teams and systems."
Handles problems spanning multiple teams, services, or domains. Coordinates cross-team solutions.
Measurable indicators (pQA verification checklist)
| Indicator | Threshold | pQA source |
|---|---|---|
| Identifies systemic issues vs one-off across domains | ≥2 systemic findings/quarter with documented evidence | Confluence systemic analysis archive |
| Mentors L3s on complex problem decomposition | ≥2 L3s coached/quarter with documented sessions | Confluence coaching log |
| Brings in outside TLs/SAs when domain calls | ≥3 cross-team consultations/quarter pulled in | Slack consultation thread log |
| Joint debugging sessions across service owners | ≥1 joint debug session/month facilitated | Calendar + meeting notes |
| Owns problems spanning multiple teams/services | ≥1 multi-team problem owned/quarter | Jira / Confluence ownership record |
Observable behaviors
1. Handles problems that span multiple teams, services, or domains.
2. Orchestrates cross-team problem solving (joint debugging sessions, shared task forces).
3. Identifies systemic issues vs one-off issues.
4. Mentors L3s on complex problem decomposition.
5. Brings in outside expertise (other Tech Leads, SAs) when domain calls for it.
Expected outputs
- Cross-team problem-solving plans.
- Systemic issue write-ups with proposed fixes.
- Coaching notes on complex problem-solving.
What you would see
When 3 services in the system are sporadically failing, L4 sets up a joint debugging session with the 3 service owners, captures the shared root cause (a downstream dependency timing out), and drives the fix across all 3 services.
Common traps
- Tries to solve cross-team problems alone instead of coordinating.
L5
"I solve problems at the level where one good decision saves the department a year of work."
Handles strategic problems — technology choices, system rewrites, capability builds. Decisions impact the whole department.
Measurable indicators (pQA verification checklist)
| Indicator | Threshold | pQA source |
|---|---|---|
| Frames problems for executive-level decisions | ≥2 exec-framed problem memos/year | Confluence exec memo archive |
| Considers second-order effects (talent, ecosystem, training) | ≥1 multi-page analysis/quarter covering ≥3 second-order dimensions | Confluence analysis doc |
| Build vs buy / monolith vs services / cloud strategy decisions | Co-authors ≥2 strategic-tech decisions/year | Confluence strategy memo + leadership sign-off |
| Multi-page analyses covering cost, talent, risk, timeline | ≥2 analyses/year with all 4 dimensions | Confluence analysis doc audit |
| Owns strategic problems with department-wide impact | ≥1 department-impact problem owned/year | Confluence problem ownership + outcome record |
Observable behaviors
1. Handles problems at the strategic level (build vs buy, monolith vs services, cloud strategy).
2. Frames problems for executive-level decisions.
3. Considers second-order effects (talent, hiring, training, ecosystem).
4. Drives consensus across multiple stakeholders.
Expected outputs
- Strategic decision documents with options, trade-offs, recommendation.
- Cross-stakeholder alignment notes.
What you would see
When the department considers migrating from on-prem to cloud, L5 writes a 6-page analysis covering cost, talent, risk, and timeline, presents it to the Delivery Manager, and gets a decision in 2 meetings.
Common traps
- Strategic decisions drift without enforcement.
Anti-signal & scoring rule · 3-of-5 majority
| Dimension | L1 · Aware | L2 · Familiar | L3 · Proficient | L4 · Advanced | L5 · Expert |
|---|---|---|---|---|---|
| Anti-signal · auto-disqualifies the level · Rejects | Stalls when no clear process exists; does not try to figure it out. | Picks the first plausible solution; does not weigh alternatives. | Over-analyzes; paralysis by analysis. | Tries to solve cross-team problems alone instead of coordinating. | Strategic decisions drift without enforcement. |
Scoring rule: You are at Level N if you hit Level N in at least 3 of 5 dimensions, AND no dimension is below Level (N−1). To compute it, sort the 5 dimension levels in descending order; the 3rd value is your overall level. If the lowest dimension is more than 1 level below that value, demote by 1 level.
Communication & Collaboration
Weight ×1.5
| L1 | L2 | L3 · hire bar | L4 | L5 |
|---|---|---|---|---|
| 1 | 2 | 3 | 4 | 5 |
L1
"I ask for what I need and report what I did."
Basic communication. Asks for information, provides updates, mostly within own team.
Measurable indicators (pQA verification checklist)
| Indicator | Threshold | pQA source |
|---|---|---|
| Daily standup attendance + reports yesterday/today/blockers | ≥90% attendance, 30-sec updates | Calendar + standup notes |
| Weekly written reports to mentor | ≥1/week with clear what-done + asks | Confluence mentor log / Slack DM |
| Clarifying questions to mentor on requirements | ≥3 questions/sprint refinement | Meeting notes / Slack thread |
| Audience reach within team | ≥90% comms within team, ≤10% partner-dev | Slack channel breakdown / meeting audience log |
| Asks before sending external (client) comms | 100% external comms reviewed by senior before send | Email draft review trail |
Observable behaviors
1. Daily report on work-in-progress to the team.
2. Asks the mentor or L3 when stuck.
3. Participates in standup; speaks when called on.
4. Limited interaction with partner devs (around 10% of time).
Expected outputs
- Daily and weekly reports.
- Clarifying questions to mentor.
What you would see
In standup, L1 reports yesterday/today/blockers in 30 seconds. Asks one question in Slack per day.
Common traps
- Stays silent when stuck for too long.
L2
"I work with partner devs daily. I keep the team informed."
Comfortable in cross-team and partner conversations. Writes clear emails and PR descriptions.
Measurable indicators (pQA verification checklist)
| Indicator | Threshold | pQA source |
|---|---|---|
| Clear PR descriptions (what + why) | ≥95% PRs follow team description template | GitHub PR description audit |
| Daily partner-dev conversations | 50% team / 50% partner-dev daily | Slack channel breakdown / meeting audience log |
| Active in design discussions (proposes structures) | ≥1 proposal contribution/design session | Design review meeting notes |
| Asks clarifying questions on requirements with BA | ≥3 questions per refinement session | Meeting notes / Confluence |
| Cross-team Slack and PR review engagement | ≥5 substantive cross-team comments/week | Slack + GitHub cross-team thread audit |
Observable behaviors
1. Engages with partner devs daily (Slack, PR reviews, standups).
2. Writes clear PR descriptions explaining what changed and why.
3. Participates actively in design discussions.
4. Asks clarifying questions on requirements with BA.
5. Daily and weekly reports include context, not just a task list.
Expected outputs
- Clear PR descriptions.
- Cross-team Slack discussions.
- Design discussion contributions.
What you would see
When the partner dev posts a vague API spec, L2 responds with 3 specific clarifying questions and proposes a data structure that fits both sides.
Common traps
- Tone-deaf in writing; comes across as blunt to international partners.
L3
"I influence technical decisions through clear logic and data, not just preference."
Persuades with logic and data. Communicates technical decisions to non-engineers. Engages directly with client Tech Lead.
Measurable indicators (pQA verification checklist)
| Indicator | Threshold | pQA source |
|---|---|---|
| Substantive client-facing technical comms | ≥3 substantive client tech comms/week (email, call, PR thread) | Email + Slack + call log audit |
| Awareness of Sprint-level technical decisions | 100% awareness of Sprint tech decisions (can articulate any solo) | Sprint review check + 1:1 verification |
| Counters peer/lead with rationale | ≥1 substantive counter-position/Sprint with written rationale | PR thread + design review log |
| Contributes substantively to ceremonies | ≥3 substantive contributions per ceremony (refinement, planning, retro) | Ceremony meeting notes audit |
| Documents trade-offs in design comms | ≥2 trade-off docs authored/Sprint | Confluence trade-off doc archive |
Observable behaviors
1. Drives technical decisions with data, not opinion.
2. Communicates technical trade-offs to BA, PM, client.
3. Runs technical discussions with the client Tech Lead.
4. Writes design docs that get adopted.
5. Gives structured PR review feedback (not just nitpicks).
6. Active in cross-team architecture discussions.
Expected outputs
- Design documents adopted by the team.
- Technical briefings to non-engineering stakeholders.
- Structured PR review feedback.
What you would see
In a 4-way discussion (client Tech Lead, our Tech Lead, BA, L3), the L3 frames a contentious technical decision with data, proposes 2 options, and gets alignment in 30 minutes.
Common traps
- Wins arguments but loses relationships.
L4
"Most of my day is alignment work. I broker decisions across teams."
Brokers technical alignment across multiple teams and departments. Negotiates with partner managers. Drives consensus.
Measurable indicators (pQA verification checklist)
| Indicator | Threshold | pQA source |
|---|---|---|
| Cross-team alignment notes + stakeholder briefings | ≥1 briefing/sprint to ≥1 non-team stakeholder | Confluence briefing archive |
| Drives technical consensus across multiple teams | ≥2 cross-team consensus sessions led/quarter | Calendar + decision-doc authorship |
| Negotiates technical scope + timeline with partner managers | ≥1 scope/timeline negotiation closed/quarter solo | Email/Confluence negotiation record |
| Runs 1-hour resolution sessions between disagreeing sub-teams | ≥2 resolution sessions/quarter as facilitator | Calendar + meeting notes |
| Audience reach: 20% team / 30% cross-team / 20% other dept / 30% partner mgrs | Mix sustained over 2+ quarters | Calendar audience breakdown / Slack log |
Observable behaviors
1. Coordinates with multiple sub-team Tech Leads.
2. Negotiates with partner managers on technical scope and timeline.
3. Drives technical consensus across teams.
4. Communicates with non-technical stakeholders effectively.
5. Mentors L3s on stakeholder communication.
Expected outputs
- Cross-team alignment notes.
- Negotiation outcomes with partners.
- Stakeholder briefings.
What you would see
When 2 sub-teams disagree on a shared API design, L4 runs a 1-hour resolution session, captures the decision in writing, and both teams execute.
Common traps
- Becomes a meeting machine; loses time to think.
L5
"I represent the technology of the department to the outside world."
Represents the department externally — to partners, prospects, and other departments. Influences strategy through communication.
Measurable indicators (pQA verification checklist)
| Indicator | Threshold | pQA source |
|---|---|---|
| Presales technical proposals authored | ≥2 proposals/year with sales close | Sales/Confluence proposal archive |
| External-facing talks and writing | ≥2 external talks/year (conference, podcast, blog) | Marketing/External speaking log |
| Mediates hard architecture trade-offs without overruling | ≥3 mediated decisions/year with both-sides sign-off | ADR / decision doc with multi-party sign-off |
| Negotiates technology strategy with partners + prospects | ≥2 partner/prospect strategy sessions/quarter solo | Calendar + partner meeting notes |
| Chairs steering committees + external presentations | ≥1 steering committee chaired/quarter | Calendar + steering minutes |
Observable behaviors
1. Represents the department to prospects in presales.
2. Communicates technology direction to other departments.
3. Negotiates with partners on technology strategy.
4. Speaks at internal and external events.
Expected outputs
- Presales technical proposals.
- Technology direction briefings.
- External-facing talks and writing.
What you would see
In a 3-way conversation between the client CTO, Delivery Manager, and the SA, the SA mediates a hard architecture trade-off with neither side feeling overruled.
Common traps
- Loses contact with engineers; becomes a salesperson.
Anti-signal & scoring rule · 3-of-5 majority
| Dimension | L1 · Aware | L2 · Familiar | L3 · Proficient | L4 · Advanced | L5 · Expert |
|---|---|---|---|---|---|
| Anti-signal · auto-disqualifies the level · Rejects | Stays silent when stuck for too long. | Tone-deaf in writing; comes across as blunt to international partners. | Wins arguments but loses relationships. | Becomes a meeting machine; loses time to think. | Loses contact with engineers; becomes a salesperson. |
Scoring rule: You are at Level N if you hit Level N in at least 3 of 5 dimensions, AND no dimension is below Level (N−1). To compute it, sort the 5 dimension levels in descending order; the 3rd value is your overall level. If the lowest dimension is more than 1 level below that value, demote by 1 level.
English Proficiency
Weight ×1
| L1 | L2 | L3 · hire bar | L4 | L5 |
|---|---|---|---|---|
| 1 | 2 | 3 | 4 | 5 |
L1
"I write what I need to write. Speaking is harder."
Can read technical docs in English. Writes basic emails, internal reports. Internal meetings OK; client meetings supervised.
Measurable indicators (pQA verification checklist)
| Indicator | Threshold | pQA source |
|---|---|---|
| CEFR band | B1 (intermediate) minimum at hire | HR English test record / certificate |
| Reads technical documentation in English | 100% standard docs/specs read unaided | Mentor verbal check + reading-log |
| Basic emails, code comments, daily reports written | ≥1 written artifact/day in English | Email / Confluence / PR comment archive |
| Attends internal English meetings | ≥90% attendance, may not lead | Calendar attendance |
| Client-facing English supervised by senior | 100% client comms reviewed before send | Email draft review trail |
Observable behaviors
1. Reads technical documentation in English.
2. Writes basic emails and reports.
3. Attends internal meetings in English.
4. May not yet handle direct client conversation.
Expected outputs
- English-language daily/weekly reports.
- Basic English code comments and PR descriptions.
What you would see
L1 writes a report with mostly correct grammar but occasional awkward phrasing.
Common traps
- Avoids English speaking entirely; misses learning opportunities.
L2
"I can talk to partner devs daily without help."
Comfortable in daily English conversations with partner devs and stakeholders. Writes clear technical communication.
Measurable indicators (pQA verification checklist)
| Indicator | Threshold | pQA source |
|---|---|---|
| CEFR band | B2 (upper intermediate) | HR English test record / certificate |
| Clear PR descriptions, design notes, emails in English | ≥95% PR descriptions in clear English (peer-readable) | GitHub PR audit + peer review |
| Daily conversations with partner devs in English | ≥4 days/week speaking with partner-dev | Calendar + standup audience log |
| Explains bug fix in client standup | ≥3 client-standup explanations/week solo | Client standup notes |
| Comfortable across technical threads and specs | ≥90% threads engaged without senior interpretation | Slack/email engagement audit |
Observable behaviors
1. Daily English conversations with partner devs.
2. Writes clear PR descriptions, design notes, emails.
3. Asks and answers questions in client standups.
Expected outputs
- Clear English written communication.
- Confident verbal communication on technical topics.
What you would see
In a client standup, L2 explains a bug fix in 30 seconds in clear English.
Common traps
- Comfortable on technical topics but struggles when the conversation turns personal or strategic.
L3
"Language is no longer a barrier for me."
Same as L2, and also handles complex technical conversations with clients confidently.
Measurable indicators (pQA verification checklist)
| Indicator | Threshold | pQA source |
|---|---|---|
| CEFR band verified | B2+ (upper intermediate to advanced), formally verified | HR English test record / certificate |
| Client-facing technical discussions solo | ≥3 client technical discussions/week led solo (no senior support) | Calendar host field + meeting notes |
| Writes ADRs in English | 100% authored ADRs in English (peer-reviewed clear) | Confluence ADR author log + peer review |
| Engages in async written threads independently | ≥95% async threads engaged without senior translation | Slack/email engagement audit |
| Audience reach in EN: partner devs + partner leads | ≥40% of comms to partner devs / partner leads in English | Calendar audience breakdown |
Observable behaviors
1. Handles complex technical conversations with clients.
2. Writes design docs in clear English.
3. Participates in cross-team discussions effectively.
Expected outputs
- Clear English design docs and technical briefings.
- Confident client-facing technical conversations.
What you would see
In a heated technical discussion with the client Tech Lead, L3 holds ground and explains the trade-off without losing clarity.
Common traps
- Plateaus at "good enough" without pushing toward C1.
L4
"I chair the room. I negotiate. I present."
Chairs client meetings. Negotiates technical scope. Presents to senior client stakeholders. Coaches juniors on English communication.
Measurable indicators (pQA verification checklist)
| Indicator | Threshold | pQA source |
|---|---|---|
| CEFR band | C1 (proficient) | HR English test record / certificate |
| Chairs client meetings end-to-end | ≥4 client meetings chaired/week solo | Calendar host field + meeting notes |
| Negotiated agreements + presentations to senior stakeholders | ≥1 negotiated agreement/quarter authored solo | Confluence agreement archive |
| Reads + synthesizes business and technical material together | ≥1 synthesis brief/month (biz + tech) | Confluence brief archive |
| Audience reach in EN: senior client + partner managers | ≥30% of comms to senior client / partner mgrs | Calendar audience breakdown |
Observable behaviors
1. Chairs client meetings end-to-end.
2. Negotiates technical scope and timeline with partner managers.
3. Presents to senior client stakeholders.
4. Coaches juniors on writing and speaking.
Expected outputs
- Chaired meeting outcomes.
- Negotiated agreements with partners.
- Presentations to senior stakeholders.
What you would see
L4 runs a client steering committee for an hour, balancing technical and business topics, with the client signing off on the recommendations.
Common traps
- Speaks confidently but writing still has rough edges.
L5
"I represent the department in English. My language is part of our brand."
Department-level English. Presales presentations. Strategic writing. Sets language standards.
Measurable indicators (pQA verification checklist)
| Indicator | Threshold | pQA source |
|---|---|---|
| CEFR band | C1+ (proficient to native-like) | HR English test record / certificate |
| Presales presentations to Fortune-500 prospects | ≥2 presales presentations/year (45-min + Q&A) | Sales/Marketing presentation log |
| Presales proposals + department-level documents authored | ≥2 presales proposals/year + ≥1 dept doc/quarter | Confluence/Sales proposal archive |
| Sets writing standards across ODC/department | ≥1 writing standards doc adopted ODC-wide | Confluence standards + adoption record |
| Audience reach: 25% in-dept / 25% other-dept / 50% partners + prospects | Mix sustained over 2+ quarters | Calendar audience breakdown |
Observable behaviors
1. Presales presentations to prospects.
2. Strategic writing for department-level docs.
3. Sets English-writing standards for the department.
Expected outputs
- Presales proposals and presentations.
- Department-level documents.
What you would see
L5 presents the department capability to a Fortune-500 prospect in a 45-minute talk and Q&A; prospect signs LOI within 2 weeks.
Common traps
- Polished English but jargon-heavy; loses non-native audiences.
Anti-signal & scoring rule · 3-of-5 majority
| Dimension | L1 · Aware | L2 · Familiar | L3 · Proficient | L4 · Advanced | L5 · Expert |
|---|---|---|---|---|---|
| Anti-signal · auto-disqualifies the level (Rejects) | Avoids English speaking entirely; misses learning opportunities. | Comfortable on technical topics, struggles when conversation goes personal or strategic. | Plateaus at 'good enough' without pushing toward C1. | Speaks confidently but writing still has rough edges. | Polished English but jargon-heavy; loses non-native audiences. |
Scoring rule: You are at Level N if you hit Level N in at least 3 of 5 dimensions, AND no dimension is below Level (N−1). To compute it, sort the five dimension levels in descending order; the 3rd value is your overall level. If the lowest dimension is more than 1 level below that value, demote by 1 level.
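A minimal sketch of the scoring rule above, assuming each of the five dimensions is scored as an integer level from 1 to 5. The function name `overall_level` is illustrative only and is not part of any existing tool.

```python
def overall_level(dimension_levels):
    """Apply the 3-of-5 majority rule to five dimension levels (1-5)."""
    if len(dimension_levels) != 5:
        raise ValueError("expected exactly 5 dimension levels")
    if any(level not in (1, 2, 3, 4, 5) for level in dimension_levels):
        raise ValueError("each level must be an integer from 1 to 5")

    # Sort descending: the 3rd value is the highest level reached
    # by at least 3 of the 5 dimensions.
    ranked = sorted(dimension_levels, reverse=True)
    candidate = ranked[2]
    lowest = ranked[-1]

    # If the weakest dimension is more than 1 level below, demote by 1.
    if lowest < candidate - 1:
        candidate -= 1
    return candidate


print(overall_level([3, 3, 3, 2, 2]))
print(overall_level([4, 4, 3, 3, 1]))
```

In the second call the 3rd-highest value is 3, but the lowest dimension sits 2 levels below it, so the result is demoted to 2.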
Mentoring & Technical Leadership
Weight ×1
L1
1
L2
2
L3 · hire bar
3
L4
4
L5
5
L1
"I am still being mentored. I do not mentor."
Contributes own work. Does not yet mentor.
Measurable indicators (pQA verification checklist)
| Indicator | Threshold | pQA source |
|---|---|---|
| Receives mentoring from L3/L4 (no mentoring yet) | ≥2 mentor 1:1s/month received | Calendar + mentor notes |
| Files learning notes for future juniors | ≥1 learning note/week to team wiki | Confluence learning archive |
| Asks questions in team channel | ≥3 questions/week in team Slack | Slack #team-channel audit |
| Reviews L1 PRs as observer (no approval power) | ≥1 PR shadow-review/week with mentor | GitHub PR comment + mentor sign-off |
| Promotion / hiring involvement | 0 (none expected at L1) | HR record (intentional absence) |
Observable behaviors
1. Receives mentoring from L3 or L4.
2. Asks questions, learns.
Expected outputs
- Personal work output.
What you would see
In the team channel, L1 is mostly asking questions and being answered.
Common traps
- Does not document learnings for future juniors.
L2
"I help L1s when asked."
Mentors 1-2 L1s informally. Reviews their PRs. Answers questions.
Measurable indicators (pQA verification checklist)
| Indicator | Threshold | pQA source |
|---|---|---|
| Mentors 1-2 L1s informally | 1-2 L1s coached, a few hours/week | Mentor 1:1 calendar + notes |
| Reviews L1 PRs constructively | ≥3 L1 PRs reviewed/week with substantive comments | GitHub review comment depth + count |
| Answers L1 questions in Slack / in person | ≥5 L1 questions answered/week | Slack thread audit |
| Walks juniors through tricky sessions | ≥1 pairing session/week with L1 | Calendar pairing session + notes |
| Informal mentoring notes filed | ≥1 mentor note/L1/month | Confluence mentor log |
Observable behaviors
1. Mentors 1-2 L1s informally (a few hours per week).
2. Reviews L1 PRs constructively.
3. Answers L1 questions in Slack or in person.
Expected outputs
- L1 PR reviews.
- Mentoring notes (informal).
What you would see
Friday afternoon, L2 spends 1 hour walking an L1 through a tricky debugging session.
Common traps
- Mentors only on technical topics; ignores career or behavioral aspects.
L3
"I am the technical anchor of the team. My job is to make the team succeed, not just me."
Mentors L1 and L2 actively. Sets the technical bar. Reviews everyone's PRs. Coaches on design and quality.
Measurable indicators (pQA verification checklist)
| Indicator | Threshold | pQA source |
|---|---|---|
| Active L1 mentees | 1-2 L1 mentees with active weekly 1:1 cadence | Mentor 1:1 calendar + notes |
| Design-level PR reviews on mentee work | ≥80% mentee PR reviews include design-level feedback | GitHub review comment audit |
| Grows engineers | ≥1 engineer demonstrably grown to next level/year | HR promotion record + mentor attribution |
| Coaching session breadth | ≥1 coaching session/month covering breadth (career, tech, soft skill) | Confluence coaching log |
| Shares knowledge with broader team | ≥1 knowledge-share session/quarter (brown bag, demo, doc) | Calendar + Confluence knowledge-share archive |
Observable behaviors
1. Actively mentors L1 and L2 (regular 1:1s, structured feedback).
2. Sets the technical bar: what 'good' looks like in this team.
3. Reviews everyone's PRs systematically.
4. Coaches on design, quality, and career.
5. Develops at least one engineer toward the next level.
Expected outputs
- Mentoring notes per mentee.
- Coaching feedback after key reviews.
- Promotion recommendations to ODC Lead or Tech Lead.
What you would see
L3 has a weekly 30-min 1:1 with each L1/L2, focused on growth not just status. Junior engineers visibly improve quarter over quarter.
Common traps
- Hero engineer — fixes everything personally instead of teaching.
L4
"I evaluate and grow engineers. I make hire/replace decisions. I shape the team."
Manages 3-10 people across teams. Formal performance evaluation. Hiring and replacement decisions. Develops L3s into Tech Leads.
Measurable indicators (pQA verification checklist)
| Indicator | Threshold | pQA source |
|---|---|---|
| Manages 3-10 across teams + develops L3s into Tech Leads | ≥3 L3s with documented Tech-Lead IDP | HR IDP records + 1:1 log |
| Builds team culture and norms across multiple teams | ≥1 norms doc owned, adopted by ≥2 teams | Confluence norms + adoption record |
| Formal performance evaluations + hiring decisions + succession plans | 100% direct reports have formal review + IDP + named successor | HR review + succession plan |
| Clear bench of L3s ready for Tech Lead within 12 months | ≥2 L3s named in Tech-Lead pipeline with timeline | HR succession plan + IDP |
| Coaches across multiple teams on technical + behavioral | ≥2 cross-team coachees/quarter with documented log | Confluence coaching log |
Observable behaviors
1. Formal performance evaluation for L1, L2, L3 reports.
2. Hiring decisions and replacement recommendations.
3. Develops L3s toward L4 (Tech Lead).
4. Builds team culture and norms.
5. Coaches across multiple teams.
Expected outputs
- Performance evaluations.
- Hiring recommendations.
- Succession plans.
- Team culture artifacts (rituals, norms).
What you would see
L4 has a clear bench of L3 engineers who could step into Tech Lead role within 12 months; one is already shadowing.
Common traps
- Plays favorites; calibration becomes biased.
L5
"I shape the technical capability of the entire department."
Influences department-level hiring, training, technology choices. Mentors L4 Tech Leads. Sets engineering culture.
Measurable indicators (pQA verification checklist)
| Indicator | Threshold | pQA source |
|---|---|---|
| Mentors L4 Tech Leads + hosts quarterly Tech Lead peer forum | ≥2 L4s in sustained mentorship + ≥1 forum/quarter | Confluence L4 mentorship log + forum agenda |
| Sets engineering culture and standards for department | ≥1 culture/standards doc adopted ODC-wide | Confluence culture doc + adoption survey |
| Interviews senior candidates + influences hiring profile | ≥1 senior interview/month + ≥1 hiring rubric/year | HR interview panel log + rubric authorship |
| Influences department training + capability building | ≥1 training program co-owned, ≥20 engineers trained/year | Confluence training program + attendance |
| Shapes technical capability of entire department | Authors ≥1 quarterly tech health report for department | Confluence dept tech health report archive |
Observable behaviors
1. Interviews senior candidates.
2. Mentors L4 Tech Leads.
3. Sets engineering culture and standards.
4. Influences department hiring profile and training.
Expected outputs
- Senior interview evaluations.
- Tech Lead mentoring notes.
- Department engineering culture artifacts.
What you would see
L5 hosts a quarterly Tech Lead forum where L4s present hard problems and get peer review from each other and the SA.
Common traps
- Loses touch with daily engineering reality.
Anti-signal & scoring rule · 3-of-5 majority
| Dimension | L1 · Aware | L2 · Familiar | L3 · Proficient | L4 · Advanced | L5 · Expert |
|---|---|---|---|---|---|
| Anti-signal · auto-disqualifies the level (Rejects) | Does not document learnings for future juniors. | Mentors only on technical topics; ignores career or behavioral aspects. | Hero engineer: fixes everything personally instead of teaching. | Plays favorites; calibration becomes biased. | Loses touch with daily engineering reality. |
Scoring rule: You are at Level N if you hit Level N in at least 3 of 5 dimensions, AND no dimension is below Level (N−1). To compute it, sort the five dimension levels in descending order; the 3rd value is your overall level. If the lowest dimension is more than 1 level below that value, demote by 1 level.
How to advance — promotion criteria
L1 → L2
Typical 12–18 months
What to demonstrate
- L2 in all 8 competencies
- L3 in Tech Mastery AND Code Quality
- Ships features independently in primary stack
Portfolio (CORE)
- 5 merged PRs with substantive scope
- Stack declaration (primary + secondary)
- 1 self-debugged production issue
Verification path
1. Self-assess + Tech Lead review
2. Submit portfolio
3. Annual review sign-off
L2 → L3
Typical 18–24 months
What to demonstrate
- L3 in all 8 competencies
- L4 in Technical Mastery
- L3+ in System Design AND demonstrated mentoring readiness
- Owns a module in the codebase
Portfolio (CORE)
- 1 ADR or design doc for non-trivial feature
- 1 module ownership record (6+ months)
- Postmortem from a P1 in their module
- Mentoring evidence (onboarded junior OR PR review at scale)
Verification path
1. Self-assess + Tech Lead calibration
2. Submit portfolio
3. Tech Lead + ODC Lead sign-off
L3 → L4
Choose Tech Lead or Principal Engineer track
What to demonstrate
- Tech Lead: L4 in all 8 + L5 in Tech Mastery OR System Design + people leadership
- Principal Eng: L5 Tech Mastery + L4+ System Design + technical leadership without people management
- Track-specific evidence per sub-track
Portfolio (CORE)
- System design adopted by team or project
- Tech Lead: hiring panel participation + IDP for 1 report
- Principal: technical RFC adopted by another team
Verification path
1. Declare preferred track
2. Multi-reviewer calibration
3. ODC Lead + Head of D3 sign-off
L4 → L5
0–2 person seat at current org scale
What to demonstrate
- L5 in System Design AND Tech Mastery
- Cross-team or cross-ODC technical influence
- Justified org need for an L5 seat
Portfolio (CORE)
- System design adopted across multiple projects
- Reference architecture published
Verification path
1. Department-level review
2. Submit portfolio
3. Head of D3 + Delivery Director sign-off
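The competency thresholds in the L1 to L2 and L2 to L3 criteria above can be sketched as a simple gap check. This is an illustration only: it covers the numeric competency levels, not the portfolio items or sign-offs, and the names `PROMOTION_RULES` and `unmet_thresholds` are hypothetical.

```python
COMPETENCIES = [
    "Technical Mastery", "Code Quality", "System Design",
    "AI Tools & Modern Eng", "Problem Solving", "Communication",
    "English Proficiency", "Mentoring & Tech Leadership",
]

# step -> (minimum level in all 8 competencies, higher minimums per competency)
PROMOTION_RULES = {
    "L1->L2": (2, {"Technical Mastery": 3, "Code Quality": 3}),
    "L2->L3": (3, {"Technical Mastery": 4, "System Design": 3}),
}


def unmet_thresholds(step, scores):
    """Return {competency: (actual, required)} for every threshold not met."""
    floor, extras = PROMOTION_RULES[step]
    gaps = {}
    for name in COMPETENCIES:
        required = max(floor, extras.get(name, floor))
        actual = scores.get(name, 0)  # a missing score counts as unmet
        if actual < required:
            gaps[name] = (actual, required)
    return gaps


# Example profile: L2 everywhere, L3 in Technical Mastery and Code Quality.
profile = {name: 2 for name in COMPETENCIES}
profile["Technical Mastery"] = 3
profile["Code Quality"] = 3

print(unmet_thresholds("L1->L2", profile))  # no gaps for this profile
print(unmet_thresholds("L2->L3", profile))  # lists what is still missing
```

An empty result means the competency thresholds are met; the portfolio and verification path still apply.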
Alternative IC tracks at L4 / L5
L4 · Principal Engineer (IC)
Drives technical direction through code, architecture, and influence — not through team management.
When to consider
- You're a brilliant IC who doesn't want to manage people
- You'd rather influence through code reviews and design than 1-on-1s
- You want to grow technical depth without hitting a management ceiling
Profile fit
Strong: Tech Mastery (L5), System Design (L4+), Problem Solving (L4+)
Relaxed: Mentoring (L3 OK), Communication (L3 OK for internal)
Comp parity
Same as Tech Lead
Reversible
Yes, 1×/18mo
L5 · Distinguished Engineer (IC)
Capped 0–2 person seat. Created only when justified by org need. Sets technical standards across multiple projects.
When to consider
- You've grown into broader technical influence across the unit
- You mentor Tech Leads on technical depth
- You represent D3's engineering maturity to clients
Profile fit
Strong: Tech Mastery (L5), System Design (L5), Problem Solving (L5)
Comp parity
Same as Solution Architect
Seat count
0–2 only
How to move between tracks
1. Declare interest
2. 1-on-1 with Tech Lead + ODC Lead
3. 1-cycle trial
4. Confirm next quarter
FAQ for SE role
FAQ items will be added as questions surface from the team.
Self-assessment
Self-assess your SE level
Score yourself on each dimension and get a live radar chart of your competency profile. Scores stay in your browser — never saved or shared.
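A rough sketch of how an overall score could be derived from a self-assessment. It assumes the overall score is the weighted average of the eight competency scores, using the weights from the competency matrix (total 11.5), and that the per-level ranges (1.0–1.8 through 4.5–5.0) are bands on that average rounded to one decimal. The page does not state this formula, so treat it as an assumption; `weighted_score` and `band_for` are hypothetical names.

```python
# Weights from the competency matrix (they sum to 11.5).
WEIGHTS = {
    "Technical Mastery": 2.0,
    "Code Quality": 1.5,
    "System Design": 1.5,
    "AI Tools & Modern Eng": 1.5,
    "Problem Solving": 1.5,
    "Communication": 1.5,
    "English Proficiency": 1.0,
    "Mentoring & Tech Leadership": 1.0,
}

# Score ranges listed for the 5 SE levels.
BANDS = [
    (1.0, 1.8, "L1"),
    (1.9, 2.7, "L2"),
    (2.8, 3.6, "L3"),
    (3.7, 4.4, "L4"),
    (4.5, 5.0, "L5"),
]


def weighted_score(scores):
    """Weighted average of competency scores (assumed formula)."""
    total_weight = sum(WEIGHTS.values())
    return sum(WEIGHTS[name] * scores[name] for name in WEIGHTS) / total_weight


def band_for(score):
    """Map a score, rounded to one decimal, onto the level ranges."""
    rounded = round(score, 1)
    for low, high, label in BANDS:
        if low <= rounded <= high:
            return label
    return None


# The L3 column of the competency matrix: Technical Mastery 4, all others 3.
l3_profile = {name: 3 for name in WEIGHTS}
l3_profile["Technical Mastery"] = 4

score = weighted_score(l3_profile)
print(round(score, 2), band_for(score))
```

Under these assumptions, the expected profile in each column of the competency matrix lands inside that level's own range, which is why the weighted average is a plausible reading.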