Bryan W. Lewis
I work with a very talented group of friends at a start-up called
Paradigm4 on SciDB.
SciDB is an open-source (GPL) array-oriented database. Here is an example of
what you can do with it:
SciDB also works well with the R language, among others.
I prefer to forage, and I enjoy many mushrooms,
wild foods, and living simply.
Send electronic mail to me at:
- I get a lot of questions about using the fast truncated SVD
irlba package, especially for large problems.
So, I've started a page of miscellaneous tips here:
Whit Armstrong and I will be running a seminar on high-performance computing with R at the
R/Finance conference in May.
We will emphasize elastic computing and large-scale linear algebra (including sparse problems).
A Redis client for R:
doRedis, a parallel back end for the R language that uses Redis and foreach.
Here is the vignette documentation:
The irlba package for
R provides a state-of-the-art fast partial singular value decomposition. It's
suitable for very large-scale problems and supports sparse and dense matrices.
To give you an idea how fast it is, one can compute a five-dimensional
principal components analysis (PCA) on the Netflix data set
(480,189 user IDs and 17,770 movies) in a few minutes on a dual-core notebook
(using R's sparse Matrix package).
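To make the idea of a partial SVD on sparse data concrete, here is a minimal sketch in plain Python. It is not irlba's algorithm or API: it uses simple power iteration to approximate only the largest singular triplet, and the function name `leading_singular_triplet` and the dictionary-of-nonzeros storage are assumptions made for illustration. The point it shows is that only matrix-vector products with the nonzero entries are needed, so the full matrix is never formed or decomposed.

```python
import math
import random


def leading_singular_triplet(entries, n_rows, n_cols, iters=200, seed=0):
    """Approximate the largest singular value and its vectors.

    `entries` maps (row, col) -> value and stores only the nonzeros
    (a hypothetical sparse format chosen for this sketch).
    """
    rng = random.Random(seed)
    # Start from a random unit vector in column space.
    v = [rng.gauss(0.0, 1.0) for _ in range(n_cols)]
    norm = math.sqrt(sum(x * x for x in v))
    v = [x / norm for x in v]
    sigma = 0.0
    u = [0.0] * n_rows
    for _ in range(iters):
        # u = A v, touching only the stored nonzeros.
        u = [0.0] * n_rows
        for (i, j), a in entries.items():
            u[i] += a * v[j]
        sigma = math.sqrt(sum(x * x for x in u))
        if sigma == 0.0:
            break
        u = [x / sigma for x in u]
        # v = A^T u, normalized for the next pass.
        w = [0.0] * n_cols
        for (i, j), a in entries.items():
            w[j] += a * u[i]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:
            break
        v = [x / norm for x in w]
    return sigma, u, v


# A tiny 3 x 2 example with two nonzero entries.
A = {(0, 0): 3.0, (2, 1): 1.0}
sigma, u, v = leading_singular_triplet(A, n_rows=3, n_cols=2)
print(sigma)
```

Power iteration converges slowly when the leading singular values are close together and returns one triplet at a time, which is part of why dedicated partial SVD methods are preferred for large problems.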
- My lightning talk on SciDB and R for the Boston R meetup on 22-Jan-2013: goo.gl/btioG.
- I gave a talk at JSM about R and websockets. Here it is:
And here is a nifty application of websockets and R in quantitative finance:
Here is a silly cool "chat" script for R using websockets (many web clients can share
a super basic R session):
Joe Cheng over at RStudio has taken over active development
of the package.
- Slides about the R bigmemory package, parallel linear algebra in R, and a preview of what I'm working on with R and SciDB, from a recent talk at the Boston R Meetup:
- One new idea and one old idea that should be better known on the SVD and cointegration
(from a recent talk at R/Finance 2012):
- If you need to document anything, you should really consider
- A data frame promise for R that very quickly extracts subsets directly from raw delimited text files:
A native HTML 5 Websocket library for R:
I discussed some methods other than Hadoop for analyzing large data with
the New York CTO club. My notes are available here:
R is popular!
If you like R, or think you might, you should check out
http://rstudio.org. I highly recommend it.
Outlaw talk: "The Betfair Package" at
R/Finance 2011: Applied Finance with R
Betfair is the world's largest betting exchange with more than three million
global clients. The BetfaiR package implements the Betfair Sports API in the R
language, providing direct access to the Betfair sports exchange from R. All
of the Betfair Sports API functions are available, including functions for real
time market data and user account access. The package also provides a number of
high-level functions for sports betting analysis, modeling and graphics.
This was the first talk I ever gave where running the examples live would
require breaking the law.
An experimental new Rserve binary R server designed for improved functionality on Windows systems.
- Talk: "How good are Krylov methods for discrete ill-posed problems?" at the March 25-28 AMS meeting in Lexington, KY: http://www.ms.uky.edu/~corso/amsmaa2010/.
Here are some slides:
pvshm: a Linux file system that
provides a memory-mapping overlay for PVFS2 or other file systems lacking
memory-mapping capability.
esperr, a package for streaming event processing for R.
http://github.com/bwlewis/fls, an implementation of the Kalaba-Tesfatsion flexible least squares method for R.
R4P, an R library for Processing.
An R-language interface package to the DTN IQ Feed API.
I've been playing with Google APIs:
A Particularly Silly Dictionary
GNU/Linux utilities for IQFeed (iqfeedutils.tar.bz2): a set of basic utilities to get DTN IQFeed up and running in Windows-free GNU/Linux environments and to facilitate communication between Linux quant boxes and Windows IQFeed boxes. A new Redis-based utility is included that is very effective at processing level 1 market data.
bars: A companion stream processor for DTN IQ Feed that builds real-time minute bars from streaming market quote data.
Ratlab, tools for foolin' with R and Octave (or Matlab) together.
http://etna.mcs.kent.edu/vol.8.1999/pp15-20.dir/gershini.html An old Java applet I wrote for Richard Varga that nicely illustrates Gershgorin discs and the ovals of Cassini.
http://etna.math.kent.edu/vol.30.2008/pp128-143.dir/zeros/index.html A newer Java applet illustrating the dynamical motion of the zeros of the partial sums of the exponential function (from work with Richard Varga and Amos Carpenter).
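For readers unfamiliar with the Gershgorin discs that the first applet above illustrates, here is a small sketch in plain Python. Each row of a square matrix gives one disc in the complex plane, centered at the diagonal entry with radius equal to the sum of the absolute values of the other entries in that row, and every eigenvalue lies in at least one disc. The function names `gershgorin_discs` and `in_some_disc` are hypothetical and have no connection to the applet's code. The ovals of Cassini are not covered here.

```python
def gershgorin_discs(matrix):
    """Return a (center, radius) pair for each row of a square matrix."""
    discs = []
    for i, row in enumerate(matrix):
        # Radius: sum of absolute values of the off-diagonal entries.
        radius = sum(abs(x) for j, x in enumerate(row) if j != i)
        discs.append((row[i], radius))
    return discs


def in_some_disc(z, discs):
    """Check whether the point z lies in at least one closed disc."""
    return any(abs(z - center) <= radius for center, radius in discs)


# Example: this matrix has eigenvalues 5 and 2
# (characteristic polynomial x^2 - 7x + 10).
A = [[4, 1],
     [2, 3]]
discs = gershgorin_discs(A)
print(discs)
print(in_some_disc(5, discs), in_some_disc(2, discs))
```

Because `abs` works on complex numbers, the same functions accept complex matrix entries and complex test points.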