
Verification limits
Recently there were a few good articles on the Verifier Rule, which point out that some tasks are easier to verify than to solve, and some are the opposite. AI labs are aiming to solve most of the easily verifiable tasks soon through RL post-training. That leaves the tasks where the asymmetry works against the verifier to be automated later, as a long tail of long-term research.
RL here is just a method of getting the verification signal back into the model weights. We ask the model to try solving a task many times, and when the verifier is happy we update the weights according to the reward signal.
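The loop described above can be sketched in a few lines. This is a toy illustration, not how any lab actually trains: the "model" is just a list of logits over ten candidate answers, the `verifier` function and its correct answer (7) are made up for the example, and the update rule is plain REINFORCE.

```python
import math
import random

def verifier(answer: int) -> float:
    """Hypothetical verifier: reward 1.0 only for the correct answer (assumed to be 7)."""
    return 1.0 if answer == 7 else 0.0

def softmax(logits):
    """Turn logits into a probability distribution."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def train(num_answers=10, steps=2000, lr=0.5, seed=0):
    rng = random.Random(seed)
    logits = [0.0] * num_answers  # stand-in for the model weights
    for _ in range(steps):
        probs = softmax(logits)
        # The model makes one attempt at the task.
        answer = rng.choices(range(num_answers), weights=probs)[0]
        # The verifier scores the attempt.
        reward = verifier(answer)
        # REINFORCE: d log p(answer) / d logit_i = one_hot(answer)_i - probs_i
        for i in range(num_answers):
            grad = (1.0 if i == answer else 0.0) - probs[i]
            logits[i] += lr * reward * grad
    return softmax(logits)

probs = train()
print(f"probability of the verified answer after training: {probs[7]:.3f}")
```

The weights only move on attempts the verifier accepts, so the speed of learning is bounded by how often, and how cheaply, the verifier can say yes.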
Tasks that are easier to verify than solve
-
Sudoku and Logic Puzzles: As mentioned in the article, solving a Sudoku puzzle requires navigating a large tree of possibilities and constraints. However, once the grid is filled, verifying the solution takes mere seconds: you simply check that every row, column, and box contains each of the digits 1–9 exactly once.
-
Software Engineering: Writing the code to build a complex platform (like Instagram) takes teams of engineers years of development. In contrast, verifying that the "solution" works can be done by a layperson in seconds simply by opening the app and seeing if the feed loads.
-
Cryptographic Hashing (Password Cracking): In computer security, finding a password that matches a specific "hash" is computationally expensive (often impossible without brute force). However, if someone provides a candidate password, the system can verify it instantly by running the hash function once.
-
Math Competition Problems: Solving a complex geometry or algebra problem might take hours of creative thinking and derivation. However, if you are given a proposed final answer (and potentially the steps), plugging the numbers back into the original equation to see if they hold true is often much faster.
-
Lock Picking vs. Key Usage: Physically, "solving" a lock without a key is a difficult skill requiring time and manipulation. "Verifying" the solution (using the correct key) is instantaneous—the lock either turns or it doesn't.
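Two of the examples above, Sudoku and password hashes, have verifiers short enough to write out. The sketch below is only an illustration: the function names are made up, the Sudoku check assumes a fully filled 9x9 grid of integers, and the password check uses a bare SHA-256 for simplicity (real systems use salted, deliberately slow password hashes).

```python
import hashlib

def is_valid_sudoku(grid):
    """Check a filled 9x9 grid: each row, column, and 3x3 box must hold 1-9 exactly once."""
    digits = set(range(1, 10))
    rows = [set(row) for row in grid]
    cols = [set(col) for col in zip(*grid)]
    boxes = [
        {grid[r][c] for r in range(br, br + 3) for c in range(bc, bc + 3)}
        for br in (0, 3, 6)
        for bc in (0, 3, 6)
    ]
    # Nine cells per group, so "the set equals {1..9}" implies "each digit once".
    return all(group == digits for group in rows + cols + boxes)

def verify_password(candidate: str, expected_hex: str) -> bool:
    """One hash evaluation decides whether the candidate matches (simplified: unsalted SHA-256)."""
    return hashlib.sha256(candidate.encode("utf-8")).hexdigest() == expected_hex

# A known valid grid built from a shifting pattern, and a copy with two cells swapped.
solved = [[(3 * (r % 3) + r // 3 + c) % 9 + 1 for c in range(9)] for r in range(9)]
broken = [row[:] for row in solved]
broken[0][0], broken[0][1] = broken[0][1], broken[0][0]

stored = hashlib.sha256(b"hunter2").hexdigest()  # hypothetical stored hash

print(is_valid_sudoku(solved), is_valid_sudoku(broken))
print(verify_password("hunter2", stored), verify_password("hunter3", stored))
```

Both verifiers do a fixed, small amount of work, while the matching solvers (searching for a valid grid, or for a password with a given hash) have to explore a huge space of candidates.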
Tasks that are easier to solve than verify
-
Generative Text / Essays (Brandolini’s Law): It is very fast to write a convincing-sounding essay or blog post filled with statistics. It takes an order of magnitude more time for a human to verify that every fact, citation, and figure in that essay is actually correct.
-
Scientific Hypotheses: It is incredibly easy to propose a new diet (e.g., "Eating only blueberries improves memory"). It takes years of rigorous clinical trials, control groups, and data analysis to verify whether that hypothesis is scientifically true.
-
Code Security: A developer can write a "solution" to a coding problem in minutes that compiles and runs. However, verifying that the code is completely secure and free of vulnerabilities (like memory leaks or edge-case bugs) is much harder and often technically non-trivial.
-
Legal Accusations: In a courtroom setting, it is often easier to invent a narrative or "theory of the crime" (the solution) than it is to verify it through the collection of forensic evidence, witness testimony, and cross-examination.
-
Predictions: It is easy to generate a prediction (e.g., "This stock will double in value by next year"). Verifying this solution is impossible in the present; it requires waiting for the passage of time to see if the prediction materializes.
This has an interesting implication for which industries are going to be automated by AI. If an industry's product verification cycle is long and requires manual labour, then that industry will only be automated once its verification asymmetry is either reduced or simulated well.
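A back-of-the-envelope calculation shows why the length of the verification cycle matters so much for RL-style training. It uses the rough figures from this post (a Sudoku check in about a second, which is an assumed round number, and a decade for a drug trial) and counts how many reward signals a single verifier could return per year. The function name is made up for the example.

```python
SECONDS_PER_YEAR = 365 * 24 * 60 * 60

def signals_per_year(verification_seconds: float) -> float:
    """How many verified attempts one verifier can complete in a year, run back to back."""
    return SECONDS_PER_YEAR / verification_seconds

# Rough, illustrative cycle times; not measurements.
cycles = {
    "sudoku check (assumed ~1 s)": 1.0,
    "drug trial (about a decade)": 10 * SECONDS_PER_YEAR,
}

for name, seconds in cycles.items():
    print(f"{name}: {signals_per_year(seconds):,.1f} reward signals per year")
```

The gap is many orders of magnitude, which is why a slow, manual verifier has to be either sped up or replaced by a good simulation before this kind of training becomes practical.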
Industries to automate much later
-
Pharmaceuticals (Drug Discovery): A chemist can design a new molecular compound (the solution) in a day. However, verifying that the drug is safe and effective for humans requires a decade of clinical trials costing billions of dollars. That said, there are labs like Isomorphic Labs that are aiming to speed up the process.
-
Civil Engineering (Infrastructure): Pouring a concrete bridge deck is a straightforward solution. Verifying the structural integrity and long-term fatigue resistance of that bridge over a 50-year lifespan is a massive undertaking involving sensors and periodic inspections.
-
Aerospace (Component Manufacturing): Manufacturing a single turbine blade for a jet engine is automated and fast. Verifying that the blade has zero microscopic fissures—which could cause a catastrophic failure mid-flight—requires expensive X-ray and ultrasonic testing.
-
Environmental Policy (Carbon Offsets): A company can easily claim to be "carbon neutral" by purchasing offsets. Verifying that those trees were actually planted, are still alive, and wouldn't have been planted anyway (additionality) is an ongoing global monitoring challenge.
-
Deep-Sea and Space Exploration: Proposing a mission or a landing site is a matter of calculation. Verifying the actual conditions of that environment (e.g., checking for life under the ice of Europa) is far more difficult than drawing up the theoretical plan.
-
Food Safety (Supply Chain): A factory can produce thousands of jars of peanut butter daily. Verifying that every single jar is free of Salmonella or heavy metals requires complex sampling and lab work that lags far behind production speed.
-
Academic Peer Review: A researcher can write a paper in a few months. Verifying that the data isn't fraudulent and that the experiments are replicable often takes the scientific community years of follow-up study, especially in modern physics.
So it seems like we need some foundational work in these industries to enable faster progress there.
How can we enable easier/quicker verification for these industries?
December 2025

