Machine Learning for Process Time Prediction in Multi-Chamber Tools
Historical averages miss the structured variation that drives cycle time surprises.

Most fabs estimate process time using historical averages. For many applications, this works fine. When variation is mostly random, the average is as good a predictor as anything else.
But in complex multi-chamber and cascading tools, variation is not mostly random. It is structured, and a significant portion of it is predictable if you have the right features.
Why Averages Fail in Multi-Chamber Environments
Two lots running the same recipe on the same tool can still experience meaningfully different process times. The difference depends on the specific chamber path assigned, the concurrent activity on other chambers, contention for shared process modules, and the configuration of the tool at the time the lot runs. None of that shows up in a recipe-level historical average.
The gap between the averaged estimate and the actual process time is not just a scheduling inconvenience. It affects cycle time performance, capacity planning accuracy, and the quality of WIP scheduling decisions downstream. When the schedule is built on process time estimates that do not reflect actual tool behavior, surprises accumulate and the fab runs less predictably than it should.
How the ML Pipeline Works
INFICON has developed a machine learning process-time prediction pipeline that addresses this directly. The system uses standard start and end tool events already available in factory data systems, making it practical to deploy without new instrumentation or data infrastructure. Its predictions are made available to the INFICON Operations Digital Twin and used by the INFICON Factory Scheduler to improve WIP scheduling across the fab.
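As an illustration of how little raw data this requires, the sketch below derives a per-lot process time from paired start and end events. The event field names (lot, tool, type, time) and the sample records are assumptions made for this example, not the schema of any particular factory data system.

```python
from datetime import datetime

def process_times(events):
    """Pair each lot's start and end event on a tool; return seconds per (lot, tool)."""
    starts = {}
    durations = {}
    for ev in sorted(events, key=lambda e: e["time"]):
        key = (ev["lot"], ev["tool"])
        if ev["type"] == "start":
            starts[key] = ev["time"]
        elif ev["type"] == "end" and key in starts:
            # Only emit a duration when a matching start was seen first.
            durations[key] = (ev["time"] - starts.pop(key)).total_seconds()
    return durations

# Hypothetical events: two lots overlapping on the same tool.
events = [
    {"lot": "L1", "tool": "ETCH01", "type": "start", "time": datetime(2024, 1, 1, 8, 0)},
    {"lot": "L2", "tool": "ETCH01", "type": "start", "time": datetime(2024, 1, 1, 8, 5)},
    {"lot": "L1", "tool": "ETCH01", "type": "end", "time": datetime(2024, 1, 1, 8, 30)},
    {"lot": "L2", "tool": "ETCH01", "type": "end", "time": datetime(2024, 1, 1, 8, 50)},
]
durations = process_times(events)
print(durations)
```

End events without a matching start are ignored here; a production pipeline would likely log or repair such records rather than drop them silently.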
The approach builds on INFICON's ATCAR (Automatic Throughput Calculation and Retrieval) system, which summarizes historical throughput by key features including tool, toolset, and recipe. Rather than replacing that proven baseline, the ML pipeline complements it. ATCAR remains valuable as a fallback and for cases where context does not strongly drive variation. Machine learning adds value where it matters most: cases where process time depends significantly on recent and concurrent tool activity.
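A grouped-average baseline of this general kind can be sketched in a few lines. This is not ATCAR's actual implementation: the fallback order (tool and recipe, then toolset and recipe, then recipe, then the overall mean), the min_count threshold, and the field names are all assumptions chosen for illustration.

```python
from collections import defaultdict
from statistics import mean

# Assumed fallback order, most specific first (hypothetical).
LEVELS = [("tool", "recipe"), ("toolset", "recipe"), ("recipe",)]

def build_baseline(history, min_count=2):
    """Average process time per group, keeping only groups with enough samples."""
    tables = []
    for level in LEVELS:
        groups = defaultdict(list)
        for row in history:
            groups[tuple(row[k] for k in level)].append(row["seconds"])
        tables.append({k: mean(v) for k, v in groups.items() if len(v) >= min_count})
    overall = mean(row["seconds"] for row in history)
    return tables, overall

def baseline_predict(baseline, lot):
    """Return the most specific available average, else the overall mean."""
    tables, overall = baseline
    for level, table in zip(LEVELS, tables):
        key = tuple(lot[k] for k in level)
        if key in table:
            return table[key]
    return overall

# Hypothetical history records.
history = [
    {"tool": "ETCH01", "toolset": "ETCH", "recipe": "R1", "seconds": 1800},
    {"tool": "ETCH01", "toolset": "ETCH", "recipe": "R1", "seconds": 2000},
    {"tool": "ETCH02", "toolset": "ETCH", "recipe": "R1", "seconds": 2200},
    {"tool": "ETCH02", "toolset": "ETCH", "recipe": "R2", "seconds": 3000},
]
baseline = build_baseline(history)
print(baseline_predict(baseline, {"tool": "ETCH02", "toolset": "ETCH", "recipe": "R1"}))
```

Every lot with the same tool and recipe receives the same estimate from such a baseline, which is exactly the limitation that context-aware features are meant to address.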
The core insight is representation. The pipeline constructs features that capture what is happening around the target lot: recently completed lots, concurrently running lots, chamber usage patterns, recipe context, and lot characteristics. For multi-chamber tools, the model includes a chamber-contention feature that estimates how much the target lot competes with overlapping lots for the same process modules. That feature captures a major source of predictable variation that is entirely invisible inside a high-level average.
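One plausible way to express a chamber-contention feature is shown below. The formula is an assumption for illustration, not the definition used in the INFICON pipeline: for each overlapping lot, it multiplies the fraction of the target lot's run time that overlaps by the fraction of the target's chambers that the other lot also uses, then sums over all other lots. It is computed on historical intervals, as would be done when building training data; at prediction time the end times would have to be replaced by estimates.

```python
def chamber_contention(target, others):
    """Sum over other lots of (time-overlap fraction) * (shared-chamber fraction)."""
    duration = target["end"] - target["start"]
    if duration <= 0 or not target["chambers"]:
        return 0.0
    score = 0.0
    for other in others:
        # Length of the interval during which both lots are on the tool.
        overlap = min(target["end"], other["end"]) - max(target["start"], other["start"])
        shared = target["chambers"] & other["chambers"]
        if overlap > 0 and shared:
            score += (overlap / duration) * (len(shared) / len(target["chambers"]))
    return score

# Hypothetical lots; times are minutes from an arbitrary origin.
target = {"start": 0, "end": 100, "chambers": {"A", "B"}}
others = [
    {"start": 50, "end": 150, "chambers": {"B", "C"}},  # half the time, one of two chambers
    {"start": 0, "end": 100, "chambers": {"C"}},        # concurrent, but no shared chamber
    {"start": 20, "end": 40, "chambers": {"A", "B"}},   # short overlap, both chambers
]
contention = chamber_contention(target, others)
print(round(contention, 3))
```

The second lot runs for the target's entire duration yet contributes nothing, because it never competes for the same process modules. A simple count of concurrent lots would not make that distinction.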
Models are trained using gradient-boosting methods and evaluated against the ATCAR baseline. Results show consistent improvement over that baseline on tools where chamber contention and operating context are important drivers of process-time variation. The system treats the handling of missing values and previously unseen category labels, model tracking, and performance reporting as production requirements, not afterthoughts.
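To make the train-and-compare step concrete, here is a deliberately tiny pure-Python gradient-boosting regressor built from one-split decision stumps, scored against a constant-average baseline using mean absolute error. It stands in for whatever production library the pipeline uses. The data is synthetic and generated from an assumed formula, so it demonstrates the mechanics only and says nothing about real-world accuracy. Missing-value and unseen-category handling are omitted.

```python
from statistics import mean

def fit_stump(X, residuals):
    """Best single-feature threshold split by squared error, or None if no split exists."""
    best = None
    for j in range(len(X[0])):
        values = sorted(set(row[j] for row in X))
        for lo, hi in zip(values, values[1:]):
            t = (lo + hi) / 2
            left = [r for row, r in zip(X, residuals) if row[j] <= t]
            right = [r for row, r in zip(X, residuals) if row[j] > t]
            lm, rm = mean(left), mean(right)
            sse = sum((r - lm) ** 2 for r in left) + sum((r - rm) ** 2 for r in right)
            if best is None or sse < best[0]:
                best = (sse, j, t, lm, rm)
    return None if best is None else best[1:]

def fit_boosting(X, y, rounds=200, learning_rate=0.3):
    """Each round fits a stump to the current residuals and adds a damped correction."""
    base = mean(y)
    preds = [base] * len(y)
    stumps = []
    for _ in range(rounds):
        residuals = [yi - pi for yi, pi in zip(y, preds)]
        stump = fit_stump(X, residuals)
        if stump is None:
            break
        j, t, lm, rm = stump
        stumps.append(stump)
        preds = [p + learning_rate * (lm if row[j] <= t else rm) for row, p in zip(X, preds)]
    return base, stumps, learning_rate

def predict(model, row):
    base, stumps, lr = model
    return base + lr * sum(lm if row[j] <= t else rm for j, t, lm, rm in stumps)

def mae(actual, predicted):
    return mean(abs(a - p) for a, p in zip(actual, predicted))

def synthetic_seconds(contention, concurrent):
    # Assumed toy relationship, for demonstration only.
    return 1800 + 600 * contention + 50 * concurrent

# Features per lot: [contention score, number of concurrent lots].
X_train = [[c / 5, n] for c in range(6) for n in (0, 2)]
y_train = [synthetic_seconds(c, n) for c, n in X_train]
X_test = [[0.15, 0], [0.45, 2], [0.75, 0], [0.95, 2]]
y_test = [synthetic_seconds(c, n) for c, n in X_test]

model = fit_boosting(X_train, y_train)
model_mae = mae(y_test, [predict(model, row) for row in X_test])
baseline_mae = mae(y_test, [mean(y_train)] * len(y_test))
print(model_mae, baseline_mae)
```

Scoring both predictors on the same held-out lots with the same error metric is the important part of the design: it shows, tool by tool, whether the contextual model actually earns its place over the simpler average.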
The Operational Impact
Accurate per-lot process-time estimates change what the scheduler can do. With better estimates, the scheduler builds tighter schedules, creates more accurate cycle-time projections, and makes better decisions about WIP prioritization and tool assignment. The improvement compounds across the factory as better individual estimates produce better planning decisions at every step.
This work also represents a concrete example of the path from factory data to manufacturing performance. The features the model uses are already collected. The data infrastructure already exists in most modern fabs. What changes is the analytic layer between raw factory events and actionable scheduling inputs. Machine learning fills that gap in a way that simple statistics cannot.
At advanced nodes, where process complexity is increasing and the tolerance for cycle-time surprises is decreasing, that gap is exactly where the next generation of scheduling capability needs to be built.