A Quick and Tidy Look at the 2018 GSS

March 22, 2019

The data from the 2018 wave of the General Social Survey was released during the week, leading to a flurry of graphs showing various trends. The GSS is one of the most important sources of information on various aspects of U.S. society. One of the best things about it is that the data is freely available for more than...

Read more »

AFL teams Elo ratings and footy-tipping by @ellis2013nz

March 22, 2019

So now that I live in Melbourne, to blend in with the locals I need to at least vaguely follow the AFL (Australian Football League). For instance, my work like many others has an AFL footy-tipping competition. I was initially going to choose my tips ba...

Read more »

Human Face Detection with R

March 22, 2019

Doing human face detection with computer vision is probably something you do once, unless you work for a police department, in the surveillance industry, or for the Chinese government. To reduce the time you lose on that small exercise, bnosac created a small R package (source code available at https://github.com/bnosac/image) which wraps the weights of a...

Read more »

How to Speed Up Gradient Boosting by a Factor of Two

March 22, 2019

Our latest tool development at STATWORX: random boost, an algorithm twice as fast as gradient boosting, with comparable prediction performance. The post How to Speed Up Gradient Boosting by a Factor of Two first appeared on STATWORX.

Read more »

How long since your team scored 100+ points? This blog’s first foray into the fitzRoy R package

March 21, 2019

When this blog moved from bioinformatics to data science I ran a Twitter poll to ask whether I should start afresh at a new site or continue here. “Continue here”, you said. So let’s test the tolerance of the long-time audience and celebrate the start of the 2019 season as we venture into the world … Continue reading How...

Read more »

RStudio Connect 1.7.2

March 21, 2019

RStudio Connect 1.7.2 is ready to download, and this release contains some long-awaited functionality that we are excited to share. Several authentication and user-management tooling improvements have been added, including the ability to change authentication providers on an existing server, new group support options, and the official introduction of SAML as a supported authentication provider (currently a beta feature*). But that’s not all… keep...

Read more »

Upcoming talks in spring 2019

March 21, 2019

This spring, I’ll be giving talks at a couple of Meetups and conferences. March 26th: at the data lounge Bremen, I’ll be talking about Explainable Machine Learning. April 11th: at the Data Science Meetup Bielefeld, I’ll be talking about Bu...

Read more »

lconnect connectivity metrics

March 21, 2019

In our package lconnect we use the Integral Index of Connectivity to obtain patch importance, but several other metrics are currently available. A description of each of these metrics can be found below. For more information about each metric, please see the references provided. At the end of the post an example using the function … Continue reading lconnect...

Read more »

Integrating Qlik Sense and R

March 21, 2019

Components Qlik Sense is a tool for exploratory data analysis and visualisation. It’s powerful and versatile. It can, however, be significantly enhanced by interfacing with R. Qlik Sense does not currently integrate directly with R. However, it’s not too tricky to get the two systems talking to each other. We’ll need two things to make this happen: Rserve: A TCP/IP...

Read more »

Package lconnect: patch connectivity metrics and patch prioritization

March 20, 2019

Today we are presenting a new package, lconnect. This package is intended to be a very simple approach to derive landscape connectivity metrics. Many of these metrics come from the interpretation of landscape as graphs. Additionally, it also provides a function to prioritize landscape patches based on their contribution to the overall landscape connectivity. For now … Continue reading Package...

Read more »

How to Avoid Publishing Credentials in Your Code

March 20, 2019

Roland Stevenson is a data scientist and consultant who may be reached on LinkedIn. When accessing an API or database in R, it is often necessary to provide credentials such as a login name and password. You may find yourself being prompted with something like this: When writing an R script that requires a user to provide credentials, you will want...
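
As an illustration of the general idea (not necessarily the approach the post itself recommends), one common way to keep credentials out of a script is to read them from environment variables. The sketch below uses only base R; the helper `get_credential()` and the variable name `MY_API_USER` are hypothetical and chosen purely for the example.

```r
# Hypothetical helper: read a credential from an environment variable
# instead of hard-coding it in the script.
get_credential <- function(name) {
  # unset = NA lets us tell "not set" apart from a real value
  value <- Sys.getenv(name, unset = NA)
  if (is.na(value) || !nzchar(value)) {
    stop("Environment variable '", name, "' is not set", call. = FALSE)
  }
  value
}

# For demonstration only: in practice the variable would be set outside
# the script (for example in ~/.Renviron), never in the code itself.
Sys.setenv(MY_API_USER = "alice")
user <- get_credential("MY_API_USER")
```

Failing loudly when the variable is missing avoids silently sending an empty login name to the API or database.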

Read more »

All Around The World: Maps and Flags in R

March 20, 2019

Our lab is international. People born all over the world have come to work in my group. I’m proud of this fact, especially in the current political climate. I’ve previously used the GoogleMaps API to display a heat map on our lab webpage. It shows where in the world people in the lab come from.

Read more »

RSAGA 1.0.0

March 19, 2019

RSAGA 1.0.0 has been released on CRAN. The RSAGA package provides an interface between R and the open-source geographic information system SAGA, which offers a variety of geoscientific methods to analyse spatial data. SAGA GIS is supp...

Read more »

Best Fantasy Player of All Time

March 19, 2019

One of the things I have always wondered about AFL fantasy is just who is the best fantasy player of all time? Not the fan who wins the most but who is the best player. So one possible idea would be to work out the fantasy scores of players going back for all the time that is possible (YAY fitzRoy!)....

Read more »

Pivoting data frames just got easier thanks to `pivot_wide()` and `pivot_long()`

There’s a lot going on in the development version of {tidyr}. New functions for pivoting data frames, pivot_wide() and pivot_long() are coming, and will replace the current functions, spread() and gather(). spread() and gather() will remain in the package though: You may have heard a rumour that gather/spread are going away. This is simply not true (they’ll stay around forever) but I...

Read more »

Data Science Software Reviews: Forrester vs. Gartner

March 19, 2019

In my previous post, I discussed Gartner's reviews of data science software companies. In this post, I show Forrester's coverage and discuss how radically different it is. As usual, this post is already integrated into my regularly-updated article, The Popularity of Data Science Software. Continue reading →

Read more »

The importance of Graphing Your Data – Anscombe’s Clever Quartet!

March 19, 2019

Francis Anscombe's seminal paper "Graphs in Statistical Analysis" (The American Statistician, 1973) effectively makes the case that looking at summary statistics of data is insufficient to identify the relationship between variables. He demonstrates this by generating four different data sets (Anscombe's quartet) which have nearly identical summary statistics. His data have the same mean and variance for x...
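
The claim about near-identical summary statistics can be checked directly, since the quartet ships with base R as the `anscombe` data frame (columns `x1` to `x4` and `y1` to `y4`). The following is a minimal sketch, not code from the post; the name `quartet_summary` is just an illustrative choice.

```r
# Anscombe's quartet is built into R's datasets package.
data(anscombe)

# Compute the same summary statistics for each of the four x/y pairs.
quartet_summary <- do.call(rbind, lapply(1:4, function(i) {
  x <- anscombe[[paste0("x", i)]]
  y <- anscombe[[paste0("y", i)]]
  data.frame(
    set    = i,
    mean_x = mean(x),
    var_x  = var(x),
    mean_y = mean(y),
    var_y  = var(y),
    cor_xy = cor(x, y)
  )
}))

# One row per data set; the columns barely differ between rows.
print(round(quartet_summary, 2))
```

Plotting each pair, for example with `plot(anscombe$x1, anscombe$y1)`, is what reveals how different the four data sets really are.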

Read more »

R and labelled data: Using quasiquotation to add variable and value labels #rstats

March 19, 2019

Labelling data is typically a task for end-users and is applied in their own scripts or functions rather than in packages. However, sometimes it can be useful for both end-users and package developers to have a flexible way to add variable and value labels to their data. In such cases, quasiquotation is helpful. This vignette demonstrates how to … Continue reading R and...

Read more »

Tidyverse users: gather/spread are on the way out

March 19, 2019

From https://twitter.com/sharon000/status/1107771331012108288: From https://tidyr.tidyverse.org/dev/articles/pivot.html: There are two important new features inspired by other R packages that have been advancing reshaping in R: The reshaping operation can be specified with a data frame that describes precisely how metadata stored in column names becomes data variables (and vice versa). This is inspired by the cdata package … Continue reading Tidyverse...

Read more »

Learning Data Science: Predicting Income Brackets

March 19, 2019

As promised in the post Learning Data Science: Modelling Basics we will now go a step further and try to predict income brackets with real world data and different modelling approaches. We will learn a thing or two along the way, e.g. about the so-called Accuracy-Interpretability Trade-Off, so read on… The data we will use … Continue reading "Learning...

Read more »

Assumptions Matter More Than Dependencies

March 18, 2019

There’s been a lot of talk about “dependencies” in the R universe of late. This is not really a post about that but more of a “really, don’t do this” if you decide you want to poke the dependency bear by trying to build a deeply flawed model off of CRAN package metadata. CRAN packages undergo... Continue reading →

Read more »

Using Scoped dplyr verbs

March 18, 2019

Introduction Over the past several months, I have really started to increase the amount that I have been using scoped dplyr verbs. For those of you who don’t know about these functions, they are handy variants to the normal dplyr verbs, such as filter, mutate, and summarize, that allow you to target multiple columns or all of your columns. These...
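
To make "target multiple columns or all of your columns" concrete, here is a small sketch using three scoped verbs from dplyr: `mutate_if()`, `summarise_all()` and `filter_at()`. The data frame `df` and its columns are invented for the example, and the code assumes the dplyr package is installed; newer dplyr releases mark these scoped variants as superseded in favour of `across()`, though they are still available.

```r
library(dplyr)

# Made-up example data: one character column, two numeric columns.
df <- data.frame(
  name   = c("a", "b", "c"),
  height = c(1.5, 2.5, 3.5),
  weight = c(10, 20, 30),
  stringsAsFactors = FALSE
)

# mutate_if(): transform every column that satisfies a predicate.
doubled <- mutate_if(df, is.numeric, function(x) x * 2)

# summarise_all(): apply one function to every column.
means <- summarise_all(select(df, height, weight), mean)

# filter_at(): filter on a chosen set of columns; any_vars() keeps rows
# where at least one of the selected columns meets the condition.
tall_or_heavy <- filter_at(df, vars(height, weight), any_vars(. > 25))
```

The suffix says which columns are targeted: `_all` means every column, `_if` means columns matching a predicate, and `_at` means columns named explicitly through `vars()`.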

Read more »

The Credibility Crisis in Data Science

March 18, 2019

Hugo Bowne-Anderson, the host of DataFramed, the DataCamp podcast, recently interviewed Skipper Seabold, a Director of Data Science at Civis Analytics. Introducing Skipper Seabold Hugo: Hi there, Skipper, and welcome to Data Framed. Skipper: Thanks. Happy to...

Read more »

RStudio Connect Quickstart

March 18, 2019

RStudio has recently announced ‘RStudio Connect QuickStart’, a VM containing a full suite of RStudio’s pro tools, available to trial for a 45-day period. RStudio Connect Quickstart gives R users, and people exploring the idea of using R in production, a quick and easy way to set up a full, production-like environment that contains all of...

Read more »

A gentle introduction to SHAP values in R

March 18, 2019

Opening the black-box in complex models: SHAP values. What are they and how to draw conclusions from them? With R code example!

Read more »

Quantifying R Package Dependency Risk

March 18, 2019

We recently commented on excess package dependencies as representing risk in the R package ecosystem. The question remains: how much risk? Is low dependency a mere talisman, or is there evidence it is a good practice (or at least correlates with other good practices)? Well, it turns out we can quantify it: each additional non-core … Continue reading Quantifying...

Read more »

Download and Plot Factor Returns from the Fama-French Research Data Library

March 18, 2019

Since the initial publication of the Three Factor Model by Eugene Fama and Kenneth French in their influential 1993 paper (Common Risk Factors in the Returns of Stocks and Bonds) a lot of academic research has been dedicated to the analysis of factors driving security returns. With the rise of quantitative investment management, this field...

Read more »

Handling & Sharing PCAPs Like a Boss with PacketTotal

March 17, 2019

The fine folks over at @PacketTotal bequeathed an API token on me so I cranked out an R package for it to enable more dynamic investigations work (RStudio makes for an amazing incident responder investigations console given that you can script in multiple languages, code in C, and write documentation all at the same time... Continue reading →

Read more »

Are R ecosystems the future?

March 17, 2019

Some random thoughts… Over the past 6 months I’ve been creating, refining, and delivering a variety of ‘Introduction to R’ training courses. The more I do this, the more I come to the view that not nearly enough is made of taking an ecosystem-oriented view to packages. A good way of talking about #rstats functionality is in terms of ecosystems, rather...

Read more »
