The Macro: Grading Is the Part of Teaching That Nobody Signed Up For
Ask any STEM professor what they spend most of their time on, and the answer is grading. Not lecturing, not research, not office hours. Grading. A single section of introductory calculus with 200 students generates thousands of pages of handwritten work every semester. Each problem needs to be read, interpreted, checked for correctness, checked for partial credit, and annotated with feedback. It is skilled labor. It is repetitive. And it takes forever.
The standard solution is teaching assistants, but TAs are expensive, inconsistent, and increasingly hard to find. A study from the National Center for Education Statistics found that the average STEM professor spends 10 to 15 hours per week on grading-related tasks. For adjuncts teaching multiple sections at different institutions, that number can hit 25 hours. That is not sustainable, and it is one of the main drivers of burnout in higher education.
The AI grading space has been growing. Gradescope, acquired by Turnitin in 2018, is the most established player. It handles rubric-based grading and has good adoption at research universities. Crowdmark focuses on collaborative grading workflows. Codio handles automated code grading. But most of these tools work best with typed, structured inputs. Handwritten STEM work, the kind with integrals scrawled in margins and circuit diagrams sketched freehand, has been harder to automate because it requires both OCR and domain-specific reasoning.
That is the gap. STEM grading specifically, with handwritten notation, partial credit logic, and step-by-step feedback, is a problem that previous tools have only partially solved.
The Micro: An AI Teaching Assistant That Actually Reads Handwriting
GradeWiz is an AI grading assistant built specifically for STEM courses. It handles math notation, code, and handwritten diagrams. The core pitch is that a professor can upload a stack of scanned assignments and get them graded with step-by-step feedback, saving what the company claims is four or more hours per week.
The numbers so far are credible. GradeWiz has already graded over 30,000 submissions at universities including Penn State, Cornell, Hunter College, Cal Poly, and Syracuse. For an early-stage EdTech product, that is meaningful traction. University adoption is notoriously slow, involving procurement processes, faculty committee approvals, and IT security reviews. Getting into five recognizable institutions this early suggests the product is solving a real pain point.
Max Bohun founded the company in 2024. The team is two people, based in San Francisco, and part of YC’s Winter 2025 batch. Details on the second team member are not publicly listed, but the company is clearly operating lean and letting the product speak for itself.
The technical challenge here is substantial. Handwritten math is hard for AI. The difference between a lowercase “a” and a partial derivative symbol is contextual. A student might write a perfectly correct proof using non-standard notation. Partial credit requires understanding not just whether the answer is right, but where the reasoning went wrong and how much credit that partial reasoning deserves. These are judgment calls on which even experienced TAs are inconsistent.
If GradeWiz is getting this right, and the university adoption suggests it is at least getting it right enough, that is a meaningful technical achievement. Gradescope can do rubric-based grading well, but it still relies heavily on human graders for handwritten work. A product that genuinely automates that step has a real competitive advantage.
The Verdict
I think GradeWiz is attacking a problem that is both urgent and underserved. The EdTech market is full of tools that help with course management, student engagement, and content delivery. Very few tools help with grading, and even fewer handle the hardest version of the grading problem, which is handwritten STEM work.
The risk is the accuracy bar. In education, a grading error is not just a bug. It is a grade dispute, a student complaint, a department head getting involved. Professors will adopt AI grading only if they trust it to be at least as accurate as their best TA. That is a high bar, and maintaining it across different courses, notation styles, and difficulty levels is going to require continuous improvement.
Thirty days from now, I want to see whether professors are using GradeWiz for high-stakes exams or just homework. That distinction matters because the tolerance for error on homework is higher. At sixty days, the question is whether GradeWiz is expanding within its existing universities, with new departments and courses adopting it, or whether adoption is stalling after the initial champions. At ninety days, I want to know the accuracy numbers. What percentage of AI-graded assignments get manually overridden by the professor? If that number is under 5%, this is a product that will spread through every STEM department in the country. If it is over 15%, it is a homework-only tool with a lower ceiling.
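The override-rate test above is easy to operationalize. As a minimal sketch, here is how a department could map a measured override rate onto those tiers; the 5% and 15% thresholds come from the argument above, while the function name and tier labels are my own illustration, not anything GradeWiz ships:

```python
def adoption_outlook(override_rate: float) -> str:
    """Classify an AI-grading manual-override rate into the tiers
    discussed above. `override_rate` is the fraction of AI-graded
    assignments whose grade a professor changed (0.0 to 1.0)."""
    if not 0.0 <= override_rate <= 1.0:
        raise ValueError("override_rate must be between 0 and 1")
    if override_rate < 0.05:
        return "spreads department-wide"
    if override_rate <= 0.15:
        return "promising, needs accuracy work"
    return "homework-only tool"

# Hypothetical semester: 48 overrides out of 1,200 AI-graded submissions.
print(adoption_outlook(48 / 1200))  # 4% override rate
```

The point of the sketch is only that the metric is cheap to track: every override is already an explicit professor action, so the data collection comes for free.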