Bryan W. Lewis
I work with a very talented group of friends at a startup called
Paradigm4 on SciDB.
SciDB is an open-source array-oriented database.
I prefer to forage, and I enjoy many mushrooms,
other
wild foods, and living simply.
Everyone working on scientific computing problems should consider using R, a wonderfully powerful and expressive system for computation and visualization.
Send electronic mail to me at:
blewis@illposed.net.
 I've been learning about clustering methods recently. Here is a link
to a simple hierarchical clustering implementation (<50 lines) that is
written only in R so it's easy to experiment with:
https://github.com/bwlewis/hclust_in_R
The native R hclust function in the stats package is faster,
but includes lots of Fortran code.
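To give a feel for how little code a naive version needs, here is a minimal Python sketch of agglomerative clustering. It is not the code in the linked repository: the function name `hclust`, the tuple-based cluster representation, and the default of complete linkage are all assumptions made for illustration.

```python
from itertools import combinations
from math import dist


def hclust(points, linkage=max):
    """Naive agglomerative clustering (hypothetical sketch).

    Returns the merges in the order performed, each as
    (members_a, members_b, height). linkage=max gives complete
    linkage, linkage=min gives single linkage.
    """
    clusters = [(i,) for i in range(len(points))]  # every point starts alone
    merges = []
    while len(clusters) > 1:
        # Find the pair of clusters with the smallest linkage distance.
        best = None
        for a, b in combinations(clusters, 2):
            d = linkage(dist(points[i], points[j]) for i in a for j in b)
            if best is None or d < best[0]:
                best = (d, a, b)
        d, a, b = best
        # Replace the two clusters by their union and record the merge.
        clusters = [c for c in clusters if c not in (a, b)] + [a + b]
        merges.append((a, b, d))
    return merges


points = [(0.0, 0.0), (0.0, 1.0), (5.0, 5.0), (5.0, 6.5)]
merges = hclust(points)
for a, b, height in merges:
    print(a, b, round(height, 3))
```

The search over all cluster pairs at every step makes this roughly cubic or worse in the number of points, so it is only for experimenting on small data.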
I wrote up a trivially simple implementation of Gene Golub's SVD subset
selection algorithm, with examples illustrating it.
Mike and I are using it in one of our GLM implementations
(see below). But it's a cool method and deserves more attention.
See http://bwlewis.github.io/GLM/svdss.html.

Mike and I have been writing down our working notes on generalized linear
models. Still incomplete and a bit rough, but maybe interesting to somebody...
Our focus is on numerics and performance. See http://bwlewis.github.io/GLM and the
associated project https://github.com/bwlewis/GLM.
This is cool: http://xkcd.r-forge.r-project.org/

http://bwlewis.github.io/cassini/ is
a fun little interactive illustration of the Gerschgorin circles and Brauer
ovals of Cassini eigenvalue inclusion theorems, written using JavaScript
and d3.js.
Please feel free to fork and use the code available on Github here:
https://github.com/bwlewis/cassini.
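The two inclusion theorems are easy to try numerically. The Python sketch below is not taken from the cassini project; the function names and the 2x2 test matrix are made up for illustration. Every eigenvalue lies in the union of the Gerschgorin discs |z - a_ii| <= r_i, where r_i is the sum of the absolute off-diagonal entries in row i, and also in the union of the Brauer ovals |z - a_ii| |z - a_jj| <= r_i r_j over pairs i != j.

```python
from itertools import combinations


def row_radii(A):
    """Sum of absolute off-diagonal entries in each row."""
    return [sum(abs(x) for j, x in enumerate(row) if j != i)
            for i, row in enumerate(A)]


def in_gerschgorin(A, z):
    """True if z lies in at least one Gerschgorin disc of A."""
    r = row_radii(A)
    return any(abs(z - A[i][i]) <= r[i] for i in range(len(A)))


def in_cassini(A, z):
    """True if z lies in at least one Brauer oval of Cassini of A."""
    r = row_radii(A)
    return any(abs(z - A[i][i]) * abs(z - A[j][j]) <= r[i] * r[j]
               for i, j in combinations(range(len(A)), 2))


A = [[4.0, 1.0],
     [2.0, -1.0]]

for z in (3.2, 4.3):
    print(z, in_gerschgorin(A, z), in_cassini(A, z))
```

For this matrix the point 3.2 sits inside a Gerschgorin disc but outside the Cassini oval, which shows that the ovals can give the tighter inclusion region.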
 I asked some questions about ill-posed problems and regularization
at Kent State recently. Here are the slides:
http://illposed.net/illposed_ksu_nov_2013.pdf.
The slides include a simple R program that
applies regularization to stock returns in order to cluster stocks
by a relevance network graph.
 I gave a talk with Jake VanderPlas about SciDB at PyData 2013 NYC. Here is a link to a Wakari notebook: http://goo.gl/ovGaHS

Model SR2 SVG Slide Rule, an SVG-based slide rule that can be scripted with JavaScript.

The Redis client for R was recently updated! R package here on CRAN:
http://cran.r-project.org/web/packages/rredis/index.html
Source code here on GitHub:
https://github.com/bwlewis/rredis
And the package vignette (PDF):
redis.pdf
 Here are some relatively recent papers I really like:
Network analysis via partial spectral factorization and Gauss quadrature
In Search of an Understandable Consensus Algorithm
A Scalable Bootstrap for Massive Data
Quadrature Rule-Based Bounds for Functions of Adjacency Matrices
Augmented Implicitly Restarted Lanczos Bidiagonalization Methods
OK, those last two are not so new, but they're supercool.
 I gave a talk on tips and tricks for performance computing with R at the Cleveland R meetup on Wednesday, August 7th. Here are the slides: http://goo.gl/gcPezs. Perhaps the most interesting part shows that it's pretty easy to install the commercial but freely available AMD BLAS and LAPACK libraries for R on Windows and Linux.
 I gave a talk at the Boston PyData conference (http://pydata.org/) about SciDB-Py, Jake VanderPlas' new interface between SciDB and Python. The interface defines a numpy/scipy-like array class for Python backed by SciDB arrays. Install the package directly from GitHub with pip install git+ssh://github.com/jakevdp/scidbpy.git.
 I've just been reading Patrick Burns' book, http://www.burns-stat.com/documents/books/tao-te-programming/, and I'm really enjoying it.
 I gave a talk on SciDB and R and Python at JSM on Sunday, August 4th. Here are the slides: http://goo.gl/A2RPkn.
 So you like Python muthaph*kkahz!?! You got it: https://github.com/bwlewis/irlbpy. This is the fastest
partial SVD and PCA routine for dense and sparse matrices available in Python.
It's restricted right now to real-valued matrices and is still under active
development. Mike Kane is presenting
our work in progress at the SciPy Conference next week in Austin, June 24-28.
 I get a lot of questions about using the fast truncated SVD
irlba package, especially for large problems.
So, I've started a page of miscellaneous tips here:
irlba.

Whit Armstrong and I ran a seminar on high performance computing with R at the
R/Finance conference in May.
We emphasized elastic computing using 0MQ and Redis with R,
and a bit of parallel linear algebra with SciDB. Here
are the slides we used:
elasticrredis.pdf.
0MQ.distributed.computing.pdf.
SciDBRbrief.pdf.

doRedis.html, a parallel back end for the R language that uses Redis and foreach.
Here is the vignette documentation:
doRedis.pdf

The irlba package for
R provides a state of the art fast partial singular value decomposition. It's
suitable for very large scale problems and supports sparse and dense matrices.
To give you an idea of how fast it is, one can compute a five-dimensional
principal components analysis (PCA) on the Netflix data set
(480,189 user IDs and 17,770 movies) in a few minutes on a dual-core notebook
(using R's sparse Matrix package).
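Partial SVD methods scale to sparse problems like this because they only touch the matrix through products with A and its transpose. The Python sketch below illustrates that idea with plain power iteration for the single largest singular value. This is a much simpler technique than the one irlba uses, and it is not the package's code: the dictionary-based sparse format and all function names are assumptions made for illustration.

```python
from math import sqrt


def matvec(A, x, nrow):
    """y = A x for a sparse matrix stored as {(row, col): value}."""
    y = [0.0] * nrow
    for (i, j), v in A.items():
        y[i] += v * x[j]
    return y


def rmatvec(A, y, ncol):
    """x = A^T y for the same sparse format."""
    x = [0.0] * ncol
    for (i, j), v in A.items():
        x[j] += v * y[i]
    return x


def leading_singular_value(A, nrow, ncol, iters=200):
    """Power iteration on A^T A; returns (sigma, right singular vector)."""
    v = [1.0 / sqrt(ncol)] * ncol  # unit-norm starting vector
    sigma = 0.0
    for _ in range(iters):
        u = matvec(A, v, nrow)
        sigma = sqrt(sum(t * t for t in u))  # ||A v|| with ||v|| = 1
        w = rmatvec(A, u, ncol)
        norm = sqrt(sum(t * t for t in w))
        if norm == 0.0:
            break
        v = [t / norm for t in w]
    return sigma, v


# The 2x2 matrix [[1, 1], [0, 1]], whose largest singular value
# is the golden ratio (1 + sqrt(5)) / 2.
A = {(0, 0): 1.0, (0, 1): 1.0, (1, 1): 1.0}
sigma, v = leading_singular_value(A, nrow=2, ncol=2)
print(sigma)
```

Power iteration converges slowly when the leading singular values are close together, and it finds only one singular triplet at a time.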
 My lightning talk on SciDB and R for the Boston R meetup on 22-Jan-2013: goo.gl/btioG.
 I gave a talk at JSM about R and websockets. Here it is:
http://illposed.net/jsm2012.pdf
And, here is a nifty application of websockets and R in quant. finance:
http://timelyportfolio.blogspot.com/2012/07/hirandaxysimd3jsnicetomeetyou.html.
Here is a silly cool "chat" script for R using websockets (many web clients can share
a super basic R session):
http://illposed.net/rchat.R.
Joe Cheng over at RStudio has taken over active development
of the package.
 Slides about the R bigmemory package, parallel linear algebra in R, and a preview of what I'm working on with R and SciDB from a recent talk at the Boston R Meetup:
http://illposed.net/boston_r_meetup_2012.pdf
 One new idea and one old idea that should be better known on the SVD and cointegration
(from a recent talk at R/Finance 2012):
Lewis_RFinance_2012.pdf.
 If you need to document anything, you should really consider
dexy.it.
 A data frame promise for R that very quickly extracts subsets directly from raw delimited text files:
lazy.frame.html.
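One way to get fast subsets from a raw delimited file without loading it is to index the byte offset of each line once and then seek straight to the requested rows. The Python sketch below illustrates that general idea only. It is not a description of how lazy.frame is implemented, and the `LazyFrame` class and its methods are hypothetical.

```python
import os
import tempfile


class LazyFrame:
    """Hypothetical sketch: index line offsets once, parse rows on demand."""

    def __init__(self, path, sep=","):
        self.path = path
        self.sep = sep
        self.offsets = []
        # One pass over the file records where each line starts.
        with open(path, "rb") as f:
            pos = 0
            for line in f:
                self.offsets.append(pos)
                pos += len(line)

    def __len__(self):
        return len(self.offsets)

    def row(self, i):
        """Seek to row i and split it into fields; nothing else is read."""
        with open(self.path, "rb") as f:
            f.seek(self.offsets[i])
            return f.readline().decode().rstrip("\r\n").split(self.sep)


# Small demonstration on a temporary file.
with tempfile.NamedTemporaryFile("w", suffix=".csv", delete=False,
                                 newline="") as tmp:
    tmp.write("a,1\nb,2\nc,3\n")

lf = LazyFrame(tmp.name)
third = lf.row(2)
count = len(lf)
os.remove(tmp.name)
print(count, third)
```

Only the offset index lives in memory, so the cost of pulling a row does not depend on the size of the file.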

A native HTML 5 Websocket library for R:
http://illposed.net/websockets.html

I discussed some methods other than Hadoop for analyzing large data with
the New York CTO club. My notes are available here:
http://goo.gl/PeJwm.

R is popular!
http://sites.google.com/site/r4statistics/popularity

If you like R, or think you might, you should check out
http://rstudio.org. I highly recommend it.

Outlaw talk: "The Betfair Package" at
R/Finance 2011: Applied Finance with R
Betfair is the world's largest betting exchange with more than three million
global clients. The BetfaiR package implements the Betfair Sports API in the R
language, providing direct access to the Betfair sports exchange from R. All
of the Betfair Sports API functions are available, including functions for real
time market data and user account access. The package also provides a number of
high-level functions for sports betting analysis, modeling and graphics.
This was the first talk I ever gave where running the examples live would
require breaking the law.

http://rforge.net/RserveWin:
An experimental new Rserve binary R server designed for improved functionality on Windows systems.
 Talk: "How good are Krylov methods for discrete ill-posed problems?", March 25-28 AMS meeting in Lexington, KY: http://www.ms.uky.edu/~corso/amsmaa2010/.
Here are some slides:
AMS_Lex_March20101.pdf

pvshm.html: A Linux filesystem that
provides a memory mapping overlay for PVFS2 or other file systems lacking
memory mapping capability.

esperr, a package for streaming event processing for R.

http://github.com/bwlewis/fls, an implementation of the Kalaba-Tesfatsion flexible least squares method for R.

R4P, an R library for Processing.

http://github.com/bwlewis/iqfeed
is an R-language interface package to the DTN IQ Feed API.

I've been playing with Google APIs:
A Particularly Silly Dictionary
GNU/Linux utilities for IQFeed (download) iqfeedutils.tar.bz2. A set of basic utilities to get DTN IQFeed up and running in Windows-free GNU/Linux environments, as well as facilitate communication between Linux quant boxes and Windows IQFeed boxes. A new Redis-based utility is included that is very effective at processing level 1 market data.
bars: A companion stream processor for DTN IQ Feed that builds real-time minute bars from streaming market quote data.
Ratlab, tools for foolin' with R and Octave (or Matlab) together.
http://etna.math.kent.edu/vol.30.2008/pp128-143.dir/zeros/index.html A newer Java applet illustrating the dynamical motion of the zeros of the partial sums of the exponential function (from work with Richard Varga and Amos Carpenter).
