QuantNet is the premier site for keeping abreast of the latest developments in Mathematical Finance. It provides articles from industry experts that can help you get a job and be successful in your career. I endeavor to live up to the quality Andy expects of all his contributors.
Businesses hire people to solve problems. Their main focus is not on the tools you use to do that, it is your job to learn available tools and advise them on their best application. Don’t fall into the trap of advocating a tool just because you figured out how to use it. Keep learning new tools so you are in a position to give the company you work for the most appropriate solution.
A friend of mine rides herd over the Python libraries used at Bank of America. One part of his job is to encourage software reuse, a well-established best practice. He shared his frustration with me about how difficult that is. People can somehow always find a reason why an existing library doesn’t do something they believe they need and insist on rewriting code from scratch. BofA has over a million Python libraries now. “Most of them written by people who had no business writing software in the first place,” according to my friend’s rather harsh assessment. Listen to people who have come before you. They want to teach you things you haven’t yet learned before you get old and start yelling at clouds.
I am fond of APL inspired languages. Kenneth Iverson invented A Programming Language in 1960 because he was appalled by the limited expressiveness of Fortran. The latest incarnation of APL is kdb. Every major hedge fund forks over a hefty licensing fee to Arthur Whitney year after year to use that. During my stint at Morgan Stanley I used an earlier version called A+. It was the most productive language for turning keystrokes into programs I have ever experienced.
Structure and Interpretation of Computer Programs is a renown MIT course that uses Scheme, a simple dialect of Lisp. There is even an edition for the course textbook, aka “The Wizard Book”, that uses JavaScript1.
Robin Milner laid down the theory behind functional programming. Prior to his work, programming languages automated operations on text, integers and floating point values. Functional languages operate on functions as first class values. Milner’s work is based on Alonso Church’s lambda calculus, Alan Turing’s thesis adviser2. Languages based on Milner’s ideas include Haskell, SML and OCaml (used by Jane Capital), F#, and the most popular functional programming language, by any measure, in the world.
I wrote the xll library to make it simple use the ancient Microsoft Excel C SDK to plug C, C++, and even Fortran into Excel. The guilty secret of Python is that its most popular packages are written in C or C++.
You will need to learn about header files, libraries, compiling, and linking to use this. The authors of Python packages use the same knowledge. C++ is complicated but some of the smartest people in the world devote their free time to keep improving it. Especially the guy who created the language. Bjarne Stroustrup made sure no single company would be able to own C++ by offering his baby to the International Standards Organization.
The xlladdins GitHub organization has many examples of how to call C and C++ from Excel.
The xll_math repository uses Python to write Excel add-ins by scraping web documentation. I work hard to be lazy.
One thing I learned from that exercise is to use etree instead of the standard Python HTML parser. Etree uses the C libraries libxml2 and libxslt. They are battle tested high performance libraries for dealing with the XML and HTML code you might come across in the internet wild.
That inspired me to write xll_inet for Excel. Modern Excel has the WEBSERVICE(url) function that returns the raw source given a URL as a string, but Excel strings are limited to 32KB. The function \URL.VIEW(url) returns a handle to all the gigabytes that can fit into RAM. The characters returned can be accessed using VIEW(handle, offset, count).
The most used relational database is sqlite. The xll_sqlite add-in embeds that in Excel.
The FP data type is the key to high performance numerical computing in Excel. Only the Excel C SDK allows passing a pointer to an array of floating point numbers instead of copying data. Need to do a 1000x1000 Cholesky decomposition? Call the LAPACK function dpotrf from an Excel add-in. It is not as easy as using linalg.cholesky but if you care about performance it can be 100 times faster.
I use the xll library for the derivative security courses I teach. There is a reason the production applications traders are required to use have an ‘Export to Excel’ button. If you hand a trader a spreadsheet then you are in their world. They will poke at it and tell you “this is right,” “that is wrong,” and “oh, by the way, can you also do…”? It is a great way to do agile development. You get a front row seat to the problems that matter to people running the business. Here is a GitHub link to code being developed this semester.
Dynamic Arrays are a thing now. Excel functions can return arrays. No need to spelunk with F2-Ctrl-Shift-Enter until you see #N/A and do Ctrl-Z to back out to the exact size of the returned array. The entire array now gets instantly plonked into your spreadsheet. It can be referenced using A1# where A1 is the upper left cell in the array.
The new LAMBDA function is based on Church’s Lambda Calculus. They hired some heavy hitters from the Haskell world to advise them. There are also many new array functions, having roots in APL, to take advantage of this.
Businesses hire people to solve problems. Their main focus is not on the tools you use to do that, it is your job to learn available tools and advise them on their best application. Don’t fall into the trap of advocating a tool just because you figured out how to use it. Keep learning new tools so you are in a position to give the company you work for the most appropriate solution.
A friend of mine rides herd over the Python libraries used at Bank of America. One part of his job is to encourage software reuse, a well-established best practice. He shared his frustration with me about how difficult that is. People can somehow always find a reason why an existing library doesn’t do something they believe they need and insist on rewriting code from scratch. BofA has over a million Python libraries now. “Most of them written by people who had no business writing software in the first place,” according to my friend’s rather harsh assessment. Listen to people who have come before you. They want to teach you things you haven’t yet learned before you get old and start yelling at clouds.
Languages
I like Python, but I don’t love it. It is great for one-off programs, but not for large projects. It was not designed for versioned software modules and it’s quirky pass-by-object-reference semantics can lead to difficult to debug problems. Programming languages are tools, not religions. It will benefit your career to learn at least one other programming language well.I am fond of APL inspired languages. Kenneth Iverson invented A Programming Language in 1960 because he was appalled by the limited expressiveness of Fortran. The latest incarnation of APL is kdb. Every major hedge fund forks over a hefty licensing fee to Arthur Whitney year after year to use that. During my stint at Morgan Stanley I used an earlier version called A+. It was the most productive language for turning keystrokes into programs I have ever experienced.
Structure and Interpretation of Computer Programs is a renown MIT course that uses Scheme, a simple dialect of Lisp. There is even an edition for the course textbook, aka “The Wizard Book”, that uses JavaScript1.
Robin Milner laid down the theory behind functional programming. Prior to his work, programming languages automated operations on text, integers and floating point values. Functional languages operate on functions as first class values. Milner’s work is based on Alonso Church’s lambda calculus, Alan Turing’s thesis adviser2. Languages based on Milner’s ideas include Haskell, SML and OCaml (used by Jane Capital), F#, and the most popular functional programming language, by any measure, in the world.
Excel
Excel is a functional programming language that allows users to directly see the result of calling a function. The results can be used as arguments to other functions and get updated when inputs change. People love that. Bill Gates is no dummy, he saw the popularity of Visicalc and Lotus 1-2-3 and built a better product.I wrote the xll library to make it simple use the ancient Microsoft Excel C SDK to plug C, C++, and even Fortran into Excel. The guilty secret of Python is that its most popular packages are written in C or C++.
You will need to learn about header files, libraries, compiling, and linking to use this. The authors of Python packages use the same knowledge. C++ is complicated but some of the smartest people in the world devote their free time to keep improving it. Especially the guy who created the language. Bjarne Stroustrup made sure no single company would be able to own C++ by offering his baby to the International Standards Organization.
The xlladdins GitHub organization has many examples of how to call C and C++ from Excel.
The xll_math repository uses Python to write Excel add-ins by scraping web documentation. I work hard to be lazy.
One thing I learned from that exercise is to use etree instead of the standard Python HTML parser. Etree uses the C libraries libxml2 and libxslt. They are battle tested high performance libraries for dealing with the XML and HTML code you might come across in the internet wild.
That inspired me to write xll_inet for Excel. Modern Excel has the WEBSERVICE(url) function that returns the raw source given a URL as a string, but Excel strings are limited to 32KB. The function \URL.VIEW(url) returns a handle to all the gigabytes that can fit into RAM. The characters returned can be accessed using VIEW(handle, offset, count).
The most used relational database is sqlite. The xll_sqlite add-in embeds that in Excel.
The FP data type is the key to high performance numerical computing in Excel. Only the Excel C SDK allows passing a pointer to an array of floating point numbers instead of copying data. Need to do a 1000x1000 Cholesky decomposition? Call the LAPACK function dpotrf from an Excel add-in. It is not as easy as using linalg.cholesky but if you care about performance it can be 100 times faster.
I use the xll library for the derivative security courses I teach. There is a reason the production applications traders are required to use have an ‘Export to Excel’ button. If you hand a trader a spreadsheet then you are in their world. They will poke at it and tell you “this is right,” “that is wrong,” and “oh, by the way, can you also do…”? It is a great way to do agile development. You get a front row seat to the problems that matter to people running the business. Here is a GitHub link to code being developed this semester.
Modern Excel
Joel Spolsky told us You Suck at Excel. He makes many good points but Excel has moved on since then. Don’t underestimate how much money Satya has been plowing into one of his cash cows.Dynamic Arrays are a thing now. Excel functions can return arrays. No need to spelunk with F2-Ctrl-Shift-Enter until you see #N/A and do Ctrl-Z to back out to the exact size of the returned array. The entire array now gets instantly plonked into your spreadsheet. It can be referenced using A1# where A1 is the upper left cell in the array.
The new LAMBDA function is based on Church’s Lambda Calculus. They hired some heavy hitters from the Haskell world to advise them. There are also many new array functions, having roots in APL, to take advantage of this.
Your New Superpower
Plug your code, or a third party library, into Excel by writing a thin wrapper that gathers arguments from Excel, call any function, and return the result. Use the full power of Excel to explore and perfect your code. Anyone can use your handiwork by opening the self-contained .xll file you produce. Get started with xll_template. Clone it, open in Visual Studio, and press F5 to build and launch Excel with the add-in loaded.- Internet lore has it Marc Andreeson wrote JavaScript in 10 days. JavaScript has nothing to do with Java, but Andreeson gave it that name to ride the Java marketing hype from Sun Microsystems at the time. It took 10 years before Douglas Crockford figured out there was a pony in there.↩︎
- The lambda calculus and Turing machines are computationally equivalent. A quick A/B test will inform you one is more useful than the other.↩︎