Doing Data Science: Straight Talk from the Frontline by Rachel Schutt, Cathy O'Neil

By Rachel Schutt, Cathy O'Neil

Now that people are aware that data can make the difference in an election or a business model, data science as an occupation is gaining ground. But how can you get started working in a wide-ranging, interdisciplinary field that’s so clouded in hype? This insightful book, based on Columbia University’s Introduction to Data Science class, tells you what you need to know.

In many of these chapter-long lectures, data scientists from companies such as Google, Microsoft, and eBay share new algorithms, methods, and models by presenting case studies and the code they use. If you’re familiar with linear algebra, probability, and statistics, and have programming experience, this book is an ideal introduction to data science.

Topics include:
• Statistical inference, exploratory data analysis, and the data science process
• Algorithms
• Spam filters, Naive Bayes, and data wrangling
• Logistic regression
• Financial modeling
• Recommendation engines and causality
• Data visualization
• Social networks and data journalism
• Data engineering, MapReduce, Pregel, and Hadoop

Doing Data Science is a collaboration between course instructor Rachel Schutt, Senior Vice President of Data Science at News Corp, and data science consultant Cathy O’Neil, a senior data scientist at Johnson Research Labs, who attended and blogged about the course.



Best nonfiction books

HIV/AIDS Treatment Drugs (Drugs: the Straight Facts)

Part of the wonderful Drugs: The Straight Facts series, HIV/AIDS Treatment Drugs teaches young adult readers the most up-to-date information about the drugs used to fight the AIDS epidemic today. Chapters provide a basic history of how these drugs were developed during the battle against AIDS and discuss kinds of drugs in depth, as well as how they work and how doctors create specific combinations to improve their effectiveness.

Sams Teach Yourself Samsung GALAXY Tab™ in 10 Minutes

Sams Teach Yourself Samsung GALAXY Tab™ in 10 Minutes offers straightforward, practical answers for fast results. By working through the 10-minute lessons, you'll learn everything you need to know to quickly and easily get up to speed on the Samsung GALAXY Tab. Step-by-step instructions walk you through the most common questions, issues, and tasks.

Cities and Cinema (Routledge Critical Introductions to Urbanism and the City)

Films about cities abound. They provide fantasies for those who recognize their city and for those for whom the city is a faraway dream or nightmare. How does cinema rework city planners’ hopes and city dwellers’ fears of modern urbanism? Can an analysis of city films answer some of the questions posed in urban studies? What kinds of vision for the future and images of the past do city films offer? What are the changes that city films have undergone?

Cities and Cinema puts urban theory and cinema studies in dialogue. The book’s first section analyzes three important genres of city films that follow in historical sequence, each associated with a particular city, moving from the city film of the Weimar Republic to the film noir associated with Los Angeles and the image of Paris in the cinema of the French New Wave. The second section discusses socio-historical themes of urban studies, beginning with the relationship of film industries and individual cities, continuing with the portrayal of war-torn and divided cities, and ending with the cinematic expression of utopia and dystopia in urban science fiction. The last section negotiates the question of identity and place in a global world, moving from the portrayal of ghettos and barrios to the city as a setting for gay and lesbian desire, to end with the representation of the global city in transnational cinematic practices.

The book suggests that modernity links urbanism and cinema. It accounts for the significant changes that city film has undergone through processes of globalization, during which the city has developed from an icon in national cinema to a privileged site for transnational cinematic practices. It is a key text for students and researchers of film studies, urban studies and cultural studies.

The Little Book of Energy Medicine: The Essential Guide to Balancing Your Body's Energies

The Little Book of Energy Medicine is a simple, easy-to-use "pocket guide" to one of the most powerful alternative health practices in existence today, from world-renowned healer Donna Eden. In this book, Eden draws on more than three decades of experience to offer readers a simple introduction to the core energy medicine exercises she recommends for feeling rejuvenated, happier, more alert, and less anxious.


Sample text

Let’s agree that there’s a spectrum, that one authority doesn’t feel right, and that “the masses” doesn’t either. So what about a clustering algorithm? How about we look at practitioners of data science and see how they describe what they do (maybe in a word cloud for starters)? Then we can look at how people who claim to be other things like statisticians or physicists or economists describe what they do. From there, we can try to use a clustering algorithm (which we’ll use in Chapter 3) or some other model and see if, when it gets as input “the stuff someone does,” it gives a good prediction on what field that person is in.
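The idea in this excerpt can be sketched in a few lines of code. The following is only an illustrative sketch and is not taken from the book: it swaps in a nearest-centroid classifier over word counts (one possible "other model") rather than a clustering algorithm, and the descriptions, labels, and function names are all hypothetical, invented for the example.

```python
from collections import Counter
import math

# Hypothetical toy descriptions of "the stuff someone does" (invented data).
TRAINING = [
    ("data scientist", "clean messy data build predictive models write code"),
    ("data scientist", "write code explore data communicate results"),
    ("statistician", "design experiments estimate parameters test hypotheses"),
    ("statistician", "test hypotheses compute confidence intervals"),
]

def bag_of_words(text):
    """Count lowercase, whitespace-separated tokens."""
    return Counter(text.lower().split())

def cosine(a, b):
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0

def fit_centroids(examples):
    """Sum word counts per label (cosine similarity ignores the scale)."""
    centroids = {}
    for label, text in examples:
        centroids.setdefault(label, Counter()).update(bag_of_words(text))
    return centroids

def predict(centroids, text):
    """Return the label whose centroid is most similar to the text."""
    words = bag_of_words(text)
    return max(centroids, key=lambda label: cosine(words, centroids[label]))

centroids = fit_centroids(TRAINING)
guess = predict(centroids, "build models and write code")
print(guess)
```

With real descriptions, how often such a model recovers the field a person claims would be one rough signal of whether the fields are distinguishable by what their practitioners say they do.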

And sadly, this is where you’ll find the least guidance in textbooks, in spite of the fact that it’s the key to the whole thing. After all, this is the part of the modeling process where you have to make a lot of assumptions about the underlying structure of reality, and we should have standards as to how we make those choices and how we explain them. But we don’t have global standards, so we make them up as we go along, and hopefully in a thoughtful way. We’re admitting this here: where to start is not obvious.

Really? We don’t think so, and we don’t think you’ll think so either by the end of the book. But the sentiment is similar to the Cukier and Mayer-Schoenberger article we just discussed about N=ALL, so you might already be getting a sense of the profound confusion we’re witnessing all around us. To their credit, it’s the press that’s currently raising awareness of these questions and issues, and someone has to do it. Even so, it’s hard to take when the opinion makers are people who don’t actually work with data.

