1000..1100 Modeling at Scale in Systematic Trading
- Scott Clark, CEO, SigOpt
- Nick Payton (organizer)
- MOE, Metric Optimization Engine
- from Cornell Univ
- optimization
- Bayesian global optimization
Asset Management
- optimization
- The Quants Run Wall Street Now => WSJ
- $300B+ assets under management
- SigOpt
- Two Sigma (asset manager) <=> SigOpt
Lessons
- Invest in a reproducible process
- Balance flexibility with standardization
- Divide labor between humans & machines
- Maximize resource utilization
- Prioritize performance (broadly)
1. Invest in a reproducible process
- 5 pillars: data, modeling, simulation, optimization, execution
- Data
- historical stock prices
- company data
- company news
- social data
- location data
- satellite data (what shows in the parking lot)
- modeling
- picking the right tool for the job
- simulation
- backtest must avoid:
- overfitting bias
- look-ahead bias
- survivorship bias
- p-hacking bias
- metric bias
- defining the methods you trust
- optimization
- how to tune the hyperparameters
- academic: grid search, random search
- particle based methods
- execution
- once you have a model you trust
- high frequency trading
- market making
- statistical arbitrage
- rebalancing
- portfolio optimization
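The look-ahead bias listed under simulation is commonly handled by splitting data so that every test window lies strictly after its training window. The following is a minimal sketch of such an expanding-window (walk-forward) split, not the process described in the talk; the function name `walk_forward_splits` and its parameters are assumptions made for illustration.

```python
from typing import Iterator, List, Tuple


def walk_forward_splits(
    n_samples: int, n_folds: int, min_train: int
) -> Iterator[Tuple[List[int], List[int]]]:
    """Yield (train, test) index lists for an expanding-window backtest.

    Each test window starts where its training window ends, so a model
    fitted on `train` never sees data from the period it is scored on.
    """
    if n_folds <= 0 or not 0 < min_train < n_samples:
        raise ValueError("need n_folds > 0 and 0 < min_train < n_samples")
    test_size = (n_samples - min_train) // n_folds
    if test_size == 0:
        raise ValueError("not enough samples for the requested folds")
    for fold in range(n_folds):
        train_end = min_train + fold * test_size
        test_end = train_end + test_size
        # Train on everything before train_end, test on the next window.
        yield list(range(train_end)), list(range(train_end, test_end))


if __name__ == "__main__":
    for train, test in walk_forward_splits(n_samples=10, n_folds=3, min_train=4):
        print(f"train 0..{train[-1]}  test {test[0]}..{test[-1]}")
```

Indices are assumed to be in time order. This guards only against look-ahead bias; survivorship bias and p-hacking need separate controls.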
2. Balance flexibility with standardization
- how to continue the advance
- framework:
- solutions: standard or proprietary per firm?
- innovation: incremental or existential for firm?
- status: still evolving or fully established?
- modeling:
- sklearn, pytorch, tensorflow
3. Divide labor between humans & machines
- what humans are bad at:
- high-dimensional optimization
- tuning the knobs on the airplane
- Hyperparameter Optimization
- the "optimizing the optimizer" problem
- focus on your model, leave hyperparameter optimization to us
- pic 8, pros and cons with different approach
- random search -> nvidia
- evolutionary algo -> google
- bayesian optimization
- you focus on data, model and backtest
- what is sent: learning rate, number of hidden layers
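The division of labor above can be sketched as a suggest/observe loop: the optimizer sees only hyperparameter values and a score, while the data, model and backtest stay on the user's side. This is a hedged illustration, not SigOpt's API. `RandomSearchOptimizer`, `toy_backtest_score` and `tune` are hypothetical names, the objective is made up, and plain random search stands in for Bayesian optimization to keep the example short.

```python
import math
import random
from typing import Dict, Optional, Tuple

Params = Dict[str, float]


class RandomSearchOptimizer:
    """Hypothetical stand-in for an external optimization service.

    It receives only hyperparameter values and scores, never data or models.
    """

    def __init__(self, seed: int = 0) -> None:
        self._rng = random.Random(seed)
        self.best: Optional[Tuple[Params, float]] = None

    def suggest(self) -> Params:
        return {
            # Learning rate sampled log-uniformly in [1e-4, 1e-1].
            "learning_rate": 10 ** self._rng.uniform(-4, -1),
            # Integer number of hidden layers in [1, 5].
            "num_hidden_layers": self._rng.randint(1, 5),
        }

    def observe(self, params: Params, score: float) -> None:
        # Keep the highest-scoring configuration seen so far.
        if self.best is None or score > self.best[1]:
            self.best = (params, score)


def toy_backtest_score(params: Params) -> float:
    """Made-up objective that peaks at learning_rate=1e-2, 3 hidden layers."""
    lr_term = (math.log10(params["learning_rate"]) + 2) ** 2
    depth_term = (params["num_hidden_layers"] - 3) ** 2
    return -(lr_term + 0.25 * depth_term)


def tune(budget: int, seed: int = 0) -> Tuple[Params, float]:
    """Run `budget` suggest/evaluate/observe rounds and return the best one."""
    if budget < 1:
        raise ValueError("budget must be at least 1")
    optimizer = RandomSearchOptimizer(seed)
    for _ in range(budget):
        params = optimizer.suggest()          # optimizer side
        score = toy_backtest_score(params)    # user side: model + backtest
        optimizer.observe(params, score)      # only the score is reported
    assert optimizer.best is not None
    return optimizer.best


if __name__ == "__main__":
    best_params, best_score = tune(budget=50)
    print(best_params, best_score)
```

A model-based optimizer would use the observed scores to choose the next suggestion; random search ignores them, which is why it serves here only as a placeholder for the interface.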
4. Maximize resource utilization
- build or buy
- well-maintained product:
- apache/spark, 19443 stars
- sheffieldml/gpyopt => 369 stars (sigopt competitor)
- Two Sigma
- asynchronous parallelization is critical for resource utilization
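A minimal sketch of the asynchronous idea, assuming a thread pool and a toy objective: a new evaluation is submitted as soon as any worker frees up, instead of waiting for a whole batch to finish. `evaluate` and `async_search` are hypothetical names, and random sampling stands in for whatever optimizer proposes the points.

```python
import random
import time
from concurrent.futures import FIRST_COMPLETED, Future, ThreadPoolExecutor, wait
from typing import Dict, Tuple


def evaluate(x: float) -> float:
    """Toy objective with a variable run time, standing in for a backtest."""
    time.sleep(0.001 + 0.005 * abs(x))  # run time depends on the input
    return -(x - 0.3) ** 2


def async_search(budget: int, workers: int, seed: int = 0) -> Tuple[float, float, int]:
    """Evaluate `budget` points, refilling idle workers immediately.

    Returns (best_x, best_score, number_of_completed_evaluations).
    """
    rng = random.Random(seed)
    best_x, best_score = 0.0, float("-inf")
    submitted = completed = 0
    pending: Dict[Future, float] = {}
    with ThreadPoolExecutor(max_workers=workers) as pool:
        while completed < budget:
            # Top up the pool so no worker sits idle while budget remains.
            while submitted < budget and len(pending) < workers:
                x = rng.uniform(-1.0, 1.0)
                pending[pool.submit(evaluate, x)] = x
                submitted += 1
            # Block only until the first evaluation finishes, not the batch.
            finished, _ = wait(list(pending), return_when=FIRST_COMPLETED)
            for future in finished:
                x = pending.pop(future)
                score = future.result()
                completed += 1
                if score > best_score:
                    best_x, best_score = x, score
    return best_x, best_score, completed


if __name__ == "__main__":
    print(async_search(budget=20, workers=4))
```

With a synchronous batch scheme, every worker would wait for the slowest evaluation in its batch; here the slow runs delay only themselves. A model-based optimizer would additionally condition each new suggestion on the results completed so far.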
5. Prioritize performance (broadly defined)
- performance (table stakes)
- better, faster, cheaper
- pic 11. two sigma better result 8x faster
- paper: https://sigopt.com/research
- A Stratified Analysis of Bayesian Optimization Methods
- helping real world problems
- car classification problem
- stanford dataset
- tuning the hyperparameters
- pic 12, performance table
- entirely new capabilities
Thank you
- https://sigopt.com/company/careers
- https://sigopt.com/blog
- https://sigopt.com/research
- https://sigopt.com/try-it
- blackbox optimization problem