Big Data: SSD’s, R, and Linked Data Streams
The Solid State Storage Revolution: If you haven’t seen it, I recommend you watch Andy Bechtolsheim’s keynote at the recent MySQL Conference. We covered SSDs in our just-published report on Big Data...
MATLAB, R, and Julia: Languages for data analysis
Big data frameworks like Hadoop have received a lot of attention recently, and with good reason: when you have terabytes of data to work with — and these days, who doesn’t? — it’s amazing to have...
Data Science tools: Are you “all in” or do you “mix and match”?
An integrated data stack boosts productivity. As I noted in my previous post, Python programmers willing to go “all in” have Python tools to cover most of data science. Lest I be accused of...
What I use for data visualization
Depending on the nature of the problem, data size, and deliverable, I still draw upon an array of tools for data visualization. As I survey the Design track at next month’s Strata conference, I see...
Four short links: 5 July 2011
Conference Organisers Handbook — accurate guide to running a two-day 300-person conference. See also Yet Another Perl Conference guidelines. Twitter Shifting More Code to JVM — interesting how, at...
Four short links: 24 August 2012
Speak Like a Pro (iTunes) — practice public speaking, and your phone will rate your performance and give you tips to improve. (via Idealog) If Hemingway Wrote Javascript — glorious. I swear I marked...
R as a Programming Language
Garrett Grolemund is an O’Reilly author and teaches classes on data analysis for RStudio. We sat down to discuss why data scientists, statisticians, and programmers alike can use the R language to...
Scaling People, Process, and Technology with Python
NOTE: If you are interested in attending OSCON to check out Dave’s talk or the many other cool sessions, click over to the OSCON website where you can use the discount code OS13PROG to get 20% off your...
A Hands-on Introduction to R
R is an open-source statistical computing environment similar to SAS and SPSS that allows for the analysis of data using various techniques like sub-setting, manipulation, visualization and modeling....
Four short links: 25 October 2013
Seagate Kinetic Storage — In the words of Geoff Arnold: The physical interconnect to the disk drive is now Ethernet. The interface is a simple key-value object oriented access scheme, implemented...
Four short links: 5 December 2013
Deducer — An R Graphical User Interface (GUI) for Everyone. Integration of Civil Unmanned Aircraft Systems (UAS) in the National Airspace System (NAS) Roadmap (PDF, FAA) — first pass at regulatory...
Scaling up data frames
Long before the advent of “big data,” analysts were building models using tools like R (and its forerunners S/S-PLUS). Productivity hinged on tools that made data wrangling, data inspection, and data...
Building pipelines to facilitate data analysis
In every data analysis, you have to string together many tools. You need tools for data wrangling, visualisation, and modelling to understand what’s going on in your data. To use these tools...
Four short links: 15 September 2014
The Care and Feeding of Weird Machines Found in Executable Metadata (YouTube) — talk from 29th Chaos Communication Congress, on tricking the ELF linker/loader into arbitrary computation from the...