1000..1100 Modeling at Scale in Systematic Trading
- Scott Clark, CEO, SigOpt
- Nick Payton (organizer)
- MOE, Metric Optimization Engine
- from Cornell Univ
- optimization
- Bayesian global optimization
Asset Management
- optimization
- The Quants Run Wall Street Now => WSJ
- $300B+ assets under management
- sigopt
- Two Sigma (asset manager) <=> SigOpt
Lessons
- Invest in a reproducible process
- Balance flexibility with standardization
- Divide labor between humans & machines
- Maximize resource utilization
- Prioritize performance (broadly)
1. Invest in a reproducible process
- 5 pillars: data, modeling, simulation, optimization, execution
 
- Data
- historical stock prices
- company data
- company news
- social data
- location data
- satellite data (what shows in the parking lot)
 
- modeling
  - picking the right tool for the job
 
- simulation
- backtest must avoid:
- overfitting bias
- look-ahead bias
- survivorship bias
- p-hacking bias
- metric bias
 
- defining the methods you trust
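
One common guard against the look-ahead bias listed above is a walk-forward split, where every test window comes strictly after its training window. This is a minimal sketch and not something shown in the talk; the function name `walk_forward_splits` and the window sizes are hypothetical.

```python
def walk_forward_splits(n_samples, train_size, test_size):
    """Yield (train_indices, test_indices) pairs in chronological order.

    Every test index is strictly later than every train index in the same
    pair, so the model is never evaluated on a period it was trained on.
    """
    start = 0
    while start + train_size + test_size <= n_samples:
        train = list(range(start, start + train_size))
        test = list(range(start + train_size, start + train_size + test_size))
        yield train, test
        start += test_size  # roll the window forward by one test period


splits = list(walk_forward_splits(n_samples=10, train_size=4, test_size=2))
for train, test in splits:
    print("train", train, "test", test)
```

The split only addresses look-ahead bias; survivorship bias and p-hacking need separate controls, such as including delisted tickers and fixing the evaluation metric before running experiments.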
 
- optimization
  - how to tune the hyperparameters
- academic: grid search, random search
- particle based methods
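
The two academic baselines named above, grid search and random search, can be compared on a toy problem. This is a sketch under stated assumptions: `objective` is a hypothetical stand-in for a backtest score, and the parameter names and ranges are invented for illustration.

```python
import itertools
import random


def objective(learning_rate, num_layers):
    """Hypothetical stand-in for a backtest score (higher is better)."""
    return -((learning_rate - 0.1) ** 2) - 0.01 * (num_layers - 3) ** 2


def grid_search(learning_rates, layer_counts):
    """Evaluate every combination on a fixed grid and return the best one."""
    candidates = itertools.product(learning_rates, layer_counts)
    return max(candidates, key=lambda c: objective(*c))


def random_search(n_trials, seed=0):
    """Sample configurations at random and return the best one seen."""
    rng = random.Random(seed)
    candidates = [
        (rng.uniform(0.001, 1.0), rng.randint(1, 8)) for _ in range(n_trials)
    ]
    return max(candidates, key=lambda c: objective(*c))


# Both searches get the same budget of 16 evaluations.
best_grid = grid_search([0.001, 0.01, 0.1, 1.0], [1, 2, 3, 4])
best_random = random_search(n_trials=16)
print("grid:", best_grid, "random:", best_random)
```

Grid search cost grows exponentially with the number of hyperparameters, while random search keeps a fixed budget but ignores what earlier trials revealed.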
 
- execution
- once you have a model you trust
- high frequency trading
- market making
- statistical arbitrage
- rebalancing
- portfolio optimization
 
2. Balance flexibility with standardization
- how to continue the advance
   
- framework:
- solutions: standard or proprietary per firm?
- innovation: incremental or existential for firm?
- status: still evolving or fully established?
 
- modeling:
- sklearn, pytorch, tensorflow
 

3. Divide labor between humans & machines
- what humans are bad at:
- high-dimensional optimization
- tuning the knobs on the airplane
 
- Hyperparameter Optimization
 
- "Optimizing the optimizer" problem
- focus on your model, leave hyperparameter optimization to us
- pic 8, pros and cons of different approaches
 
- random search -> nvidia
- evolutionary algo -> google
- bayesian optimization
- you focus on data, model and backtest
- what is sent: learning rate, number of hidden layers
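
The division of labor above can be sketched as a suggest/observe loop in which the optimizer only ever sees hyperparameter values and the resulting metric, never the data or the model. The `Suggester` class below is hypothetical and is not the SigOpt API; random search stands in for Bayesian optimization to keep the sketch short, and `train_and_backtest` is a toy placeholder.

```python
import math
import random


class Suggester:
    """Hypothetical optimizer interface: suggest() then observe().

    Random search is used as a stand-in here; a Bayesian optimizer would
    use the observation history to choose the next suggestion.
    """

    def __init__(self, seed=0):
        self.rng = random.Random(seed)
        self.history = []  # list of (params, metric) pairs

    def suggest(self):
        return {
            # sample the learning rate log-uniformly between 1e-4 and 1e-1
            "learning_rate": 10 ** self.rng.uniform(-4, -1),
            "num_hidden_layers": self.rng.randint(1, 6),
        }

    def observe(self, params, metric):
        self.history.append((params, metric))

    def best(self):
        return max(self.history, key=lambda pair: pair[1])


def train_and_backtest(params):
    """Toy placeholder for the user's private data, model, and backtest."""
    lr_term = (math.log10(params["learning_rate"]) + 2) ** 2
    layer_term = 0.1 * (params["num_hidden_layers"] - 3) ** 2
    return -(lr_term + layer_term)


optimizer = Suggester()
for _ in range(20):
    params = optimizer.suggest()         # only hyperparameter values come back
    metric = train_and_backtest(params)  # runs entirely on the user's side
    optimizer.observe(params, metric)    # only the metric is reported

best_params, best_metric = optimizer.best()
print(best_params, best_metric)
```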
4. Maximize resource utilization
- build or buy
 
- well-maintained products:
- apache/spark, 19443 stars
- sheffieldml/gpyopt => 369 stars (sigopt competitor)
 
- Two Sigma
- asynchronous parallelization is critical for resource utilization
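
A rough illustration of why asynchronous parallelization helps utilization: each worker is handed a new trial as soon as it finishes, instead of idling until the slowest trial in a batch completes. This sketch uses Python's standard `concurrent.futures`; the `evaluate` function, its sleep-based run time, and the random sampling of trials are assumptions made for illustration, not details from the talk.

```python
import random
import time
from concurrent.futures import FIRST_COMPLETED, ThreadPoolExecutor, wait


def evaluate(learning_rate):
    """Toy stand-in for a training run whose duration varies."""
    time.sleep(random.uniform(0.01, 0.05))
    return -(learning_rate - 0.1) ** 2


def run_async(n_trials, n_workers, seed=0):
    rng = random.Random(seed)
    results = []
    submitted = 0
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        pending = {}  # future -> learning rate being evaluated
        # Fill every worker once.
        while submitted < min(n_workers, n_trials):
            lr = rng.uniform(0.001, 1.0)
            pending[pool.submit(evaluate, lr)] = lr
            submitted += 1
        while pending:
            done, _ = wait(list(pending), return_when=FIRST_COMPLETED)
            for future in done:
                lr = pending.pop(future)
                results.append((lr, future.result()))
                # Refill the freed worker right away rather than waiting
                # for the rest of the batch to finish.
                if submitted < n_trials:
                    new_lr = rng.uniform(0.001, 1.0)
                    pending[pool.submit(evaluate, new_lr)] = new_lr
                    submitted += 1
    return results


results = run_async(n_trials=12, n_workers=4)
best_lr, best_score = max(results, key=lambda r: r[1])
print(len(results), best_lr, best_score)
```

In a model-based optimizer the refill step would ask for a new suggestion conditioned on the results seen so far, which is where asynchronous support in the optimizer itself matters.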
 
5. Prioritize performance (broadly defined)
- performance (table stakes)
- better, faster, cheaper
 
- pic 11: Two Sigma got better results, 8x faster
 
- paper: https://sigopt.com/research
- A Stratified Analysis of Bayesian Optimization Methods
- helping real world problems
- car classification problem
- stanford dataset
- tuning the hyperparameters
 
 
- pic 12, performance table
     
- entirely new capabilities
Thank you
- https://sigopt.com/company/careers
- https://sigopt.com/blog
- https://sigopt.com/research
- https://sigopt.com/try-it
- blackbox optimization problem











