Modeling at Scale in Systematic Trading

1000..1100 Modeling at Scale in Systematic Trading

  • Scott Clark, CEO, SigOpt
  • Nick Payton (organizer)
  • MOE, Metric Optimization Engine
  • from Cornell University
  • optimization
  • Bayesian global optimization

Asset Management

  • optimization
  • The Quants Run Wall Street Now => WSJ
  • $300B+ assets under management
  • SigOpt
  • Two Sigma (asset manager) <=> SigOpt

Lessons

  1. Invest in a reproducible process
  2. Balance flexibility with standardization
  3. Divide labor between humans & machines
  4. Maximize resource utilization
  5. Prioritize performance (broadly)

1. Invest in a reproducible process

  • 5 pillars: data, modeling, simulation, optimization, execution
  • Data
    • historical stock prices
    • company data
    • company news
    • social data
    • location data
    • satellite data (what shows in the parking lot)
  • modeling
    • picking the right tool for the job
  • simulation
    • backtest must avoid:
      • overfitting bias
      • look-ahead bias
      • survivorship bias
      • p-hacking bias
      • metric bias
    • defining the methods you trust
  • optimization
    • how to tune the hyperparameters
    • academic: grid search, random search
    • particle based methods
  • execution
    • once you have a model you trust
    • high frequency trading
    • market making
    • statistical arbitrage
    • rebalancing
    • portfolio optimization
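
The look-ahead bias item under simulation can be made concrete with a walk-forward split, where every test observation comes strictly after the data used for fitting. This is only a minimal sketch and was not shown in the talk; the function name `walk_forward_splits` and its parameters are hypothetical.

```python
from typing import Iterator, Tuple


def walk_forward_splits(
    n_obs: int, train_size: int, test_size: int
) -> Iterator[Tuple[range, range]]:
    """Yield (train, test) index ranges over time-ordered observations.

    Every test index is later than every train index in the same split,
    so a model fit on `train` never sees data from its own test period.
    """
    start = 0
    while start + train_size + test_size <= n_obs:
        train = range(start, start + train_size)
        test = range(start + train_size, start + train_size + test_size)
        yield train, test
        start += test_size  # roll the window forward by one test block


# Hypothetical example: 10 daily observations, fit on 4, evaluate on the next 2.
splits = list(walk_forward_splits(n_obs=10, train_size=4, test_size=2))
for train, test in splits:
    print(list(train), "->", list(test))
```

A shuffled train/test split would mix future prices into the training set, which is one common way look-ahead bias enters a backtest.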

2. Balance flexibility with standardization

  • how to continue the advance
  • framework:
    • solutions: standard or proprietary per firm?
    • innovation: incremental or existential for firm?
    • status: still evolving or fully established?
  • modeling:
    • sklearn, pytorch, tensorflow

3. Divide labor between humans & machines

  • what humans are bad at:
    • high-dimensional optimization
    • tuning the knobs on the airplane
  • Hyperparameter Optimization
  • the "optimizing the optimizer" problem
  • focus on your model, leave hyperparameter optimization to us
  • pic 8: pros and cons of the different approaches
  • random search -> NVIDIA
  • evolutionary algorithms -> Google
  • Bayesian optimization
  • you focus on data, model and backtest
  • what is sent: learning rate, number of hidden layers
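
The division of labor above can be sketched as a black-box loop: the optimizer only sees hyperparameter values (here learning rate and number of hidden layers) and the resulting score, never the data or the model. The sketch below uses plain random search as a stand-in, not SigOpt's Bayesian method or its API; `toy_validation_score`, `suggest`, and `random_search` are hypothetical names, and the objective is a made-up function standing in for "train and backtest".

```python
import math
import random
from typing import Callable, Dict, Tuple

Params = Dict[str, float]


def toy_validation_score(params: Params) -> float:
    """Hypothetical stand-in for training a model and backtesting it."""
    lr_term = -(math.log10(params["learning_rate"]) + 2.0) ** 2  # peaks near 1e-2
    layer_term = -0.1 * (params["hidden_layers"] - 3) ** 2       # peaks at 3 layers
    return lr_term + layer_term


def suggest(rng: random.Random) -> Params:
    """Draw one configuration; learning rate is sampled log-uniformly."""
    return {
        "learning_rate": 10 ** rng.uniform(-5, 0),
        "hidden_layers": rng.randint(1, 8),
    }


def random_search(
    objective: Callable[[Params], float], n_trials: int, seed: int = 0
) -> Tuple[Params, float]:
    """Suggest, evaluate, keep the best: the optimizer sees only params and scores."""
    rng = random.Random(seed)
    best_params: Params = {}
    best_score = -math.inf
    for _ in range(n_trials):
        params = suggest(rng)
        score = objective(params)
        if score > best_score:
            best_params, best_score = params, score
    return best_params, best_score


best_params, best_score = random_search(toy_validation_score, n_trials=50)
print(best_params, best_score)
```

Swapping `suggest` for a smarter strategy (evolutionary or Bayesian) leaves the rest of the loop unchanged, which is the point of treating tuning as a separate service.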

4. Maximize resource utilization

  • build or buy
  • well-maintained products:
    • apache/spark, 19443 stars
    • sheffieldml/gpyopt => 369 stars (sigopt competitor)
  • Two Sigma
    • asynchronous parallelization is critical for resource utilization
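
To illustrate why asynchronous parallelization helps utilization, here is a rough sketch using Python's standard `concurrent.futures`: a new trial is submitted as soon as any worker finishes, instead of waiting for a whole batch. It is an assumption-laden toy, not Two Sigma's or SigOpt's setup; `evaluate`, `suggest`, and `async_search` are hypothetical names, and `time.sleep` simulates uneven training times.

```python
import random
import time
from concurrent.futures import FIRST_COMPLETED, ThreadPoolExecutor, wait
from typing import Dict, List, Tuple

Params = Dict[str, float]


def evaluate(params: Params) -> float:
    """Hypothetical objective; the sleep simulates uneven training cost."""
    time.sleep(params["cost"])
    return -(params["x"] - 0.3) ** 2


def suggest(rng: random.Random) -> Params:
    return {"x": rng.random(), "cost": rng.uniform(0.001, 0.01)}


def async_search(
    n_trials: int, n_workers: int, seed: int = 0
) -> List[Tuple[Params, float]]:
    rng = random.Random(seed)
    results: List[Tuple[Params, float]] = []
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        pending = {}
        submitted = 0
        # Fill every worker once.
        while submitted < min(n_workers, n_trials):
            params = suggest(rng)
            pending[pool.submit(evaluate, params)] = params
            submitted += 1
        # Refill a slot the moment it frees up, rather than per batch.
        while pending:
            done, _ = wait(list(pending), return_when=FIRST_COMPLETED)
            for future in done:
                params = pending.pop(future)
                results.append((params, future.result()))
                if submitted < n_trials:
                    nxt = suggest(rng)
                    pending[pool.submit(evaluate, nxt)] = nxt
                    submitted += 1
    return results


results = async_search(n_trials=12, n_workers=4)
best_params, best_score = max(results, key=lambda item: item[1])
print(len(results), best_score)
```

With batch-synchronous scheduling, fast trials would sit idle until the slowest trial in the batch finished; here no worker waits on another.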

5. Prioritize performance (broadly defined)

  • performance (table stakes)
  • better, faster, cheaper
  • pic 11: Two Sigma, better results 8x faster
  • paper: https://sigopt.com/research
    • A Stratified Analysis of Bayesian Optimization Methods
    • helping real world problems
    • car classification problem
      • stanford dataset
      • tuning the hyperparameters
  • pic 12, performance table
  • entirely new capabilities

Thank you

  • https://sigopt.com/company/careers
  • https://sigopt.com/blog
  • https://sigopt.com/research
  • https://sigopt.com/try-it
  • blackbox optimization problem