The data frame can be modified in place. By Julia convention, an exclamation mark (!) at the end of a function name denotes a mutating function:
transform!(df, :age => (x -> 2x) => :age_doubled) # Modify df in place
3×4 DataFrame
 Row │ name     age    score  age_doubled
     │ String   Int64  Int64  Int64
─────┼────────────────────────────────────
   1 │ Alice       25     88           50
   2 │ Bob         30     92           60
   3 │ Charlie     35     95           70
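The same naming convention runs through Base Julia, so the difference between a mutating and a non-mutating call can be seen without any packages. The following is a minimal sketch with a made-up vector `v` (not part of the example data): `sort` returns a new vector and leaves its argument alone, while `sort!` reorders the argument itself. In the same way, `transform(df, ...)` returns a new data frame, whereas `transform!(df, ...)` changes `df`.

```julia
v = [3, 1, 2]                  # hypothetical data, for illustration only

w = sort(v)                    # non-mutating: returns a sorted copy
unchanged = (v == [3, 1, 2])   # true, since sort did not touch v

sort!(v)                       # mutating: v itself is now sorted
```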
We can select a subset of the variables:
df_selected = select(df, :name, :age_doubled)
3×2 DataFrame
 Row │ name     age_doubled
     │ String   Int64
─────┼──────────────────────
   1 │ Alice             50
   2 │ Bob               60
   3 │ Charlie           70
Or filter out observations:
df_filtered = filter(:age => >(28), df)
2×4 DataFrame
 Row │ name     age    score  age_doubled
     │ String   Int64  Int64  Int64
─────┼────────────────────────────────────
   1 │ Bob         30     92           60
   2 │ Charlie     35     95           70
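The `>(28)` in the filter call may look unusual. Calling a comparison operator with a single argument returns a function that compares its input against that value, so `>(28)` behaves like `x -> x > 28`. A small sketch on a plain vector (the variable names `ages`, `is_over_28` and `kept` are chosen here for illustration):

```julia
ages = [25, 30, 35]               # the ages from the example data

is_over_28 = >(28)                # behaves like x -> x > 28
kept = filter(is_over_28, ages)   # keeps only the ages above 28
```

In `filter(:age => >(28), df)` the same predicate is applied to the `age` value of each row.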
DataFramesMeta.jl provides macros for data manipulation, inspired by the tidyverse in R; the @chain macro pipes the manipulations together.
using DataFramesMeta

df_extra = DataFrame(
    name = ["Alice", "Bob", "Diana"],
    city = ["Stockholm", "Gothenburg", "Malmö"]
)

result = @chain df begin
    @transform(:age_doubled = 2 .* :age)
    @subset(_, :score .> 90) # pipes to first argument, here explicitly using _
    @select(:name, :score, :age_doubled)
    leftjoin(df_extra, on = :name) # normal DataFrames function
end
2×4 DataFrame
 Row │ name     score  age_doubled  city
     │ String   Int64  Int64        String?
─────┼─────────────────────────────────────────
   1 │ Bob         92           60  Gothenburg
   2 │ Charlie     95           70  missing
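To see what the chain is doing, it can help to write the same steps out by hand with plain DataFrames.jl functions and named intermediate results. This is only a sketch of an equivalent pipeline, not how the macros are implemented; the data are rebuilt from the tables shown above, and the names `step1`, `step2`, `step3` and `result_plain` are made up for illustration.

```julia
using DataFrames

# Data reconstructed from the tables shown above
df = DataFrame(name = ["Alice", "Bob", "Charlie"],
               age = [25, 30, 35],
               score = [88, 92, 95])
df_extra = DataFrame(name = ["Alice", "Bob", "Diana"],
                     city = ["Stockholm", "Gothenburg", "Malmö"])

# Each step takes the previous result as its first argument,
# which is exactly what the chain fills in automatically
step1 = transform(df, :age => ByRow(x -> 2x) => :age_doubled)
step2 = subset(step1, :score => ByRow(>(90)))
step3 = select(step2, :name, :score, :age_doubled)
result_plain = leftjoin(step3, df_extra, on = :name)
```

Charlie has no row in `df_extra`, so the left join keeps him and fills `city` with `missing`, which is why the column type becomes `String?`.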
TidierData.jl is a Julia re-implementation of the dplyr and tidyr packages from R.
Tidier.jl is a meta-package, similar to the tidyverse package in R.
using CSV, DataFrames, Tidier
using Downloads # standard library, needed for Downloads.download

# Read data from a URL using the standard library Downloads
url = "https://github.com/mattiasvillani/Julia4Stats/raw/main/data/titanic.csv";
http_response = Downloads.download(url); # path to the downloaded temporary file
titanic = CSV.read(http_response, DataFrame)

## TidierData.jl for data wrangling using @chain macro
titanic2 = @chain titanic begin
    @mutate(survived = survived == 1)
    @mutate(first_class = pclass == 1)
    @filter(fare > 10)
    @select(name, survived, age, sex, first_class)
end
Plots
Plots.jl - a meta plotting package with many backends
Here is an example using Plots.jl to plot the mtcars data from RDatasets.jl:
using Plots, LaTeXStrings, RDatasets, GLM

mtcars = dataset("datasets", "mtcars")

# Make a scatter plot of Horsepower vs Miles per gallon
scatter(mtcars.HP, mtcars.MPG,
    xlabel = "Horsepower", ylabel = "Miles per gallon",
    title = "MPG vs Horsepower", label = "Data points",
    legend = :topright, color = :blue)

# Fit a linear model using GLM.jl
using GLM
lm_model = lm(@formula(MPG ~ HP), mtcars)

# Add the fitted line to the plot, note the mutating plot! function
plot!(mtcars.HP, predict(lm_model), label = "Fitted line", color = :red, linewidth = 2)

# Add a LaTeX string to the title
βhat = round.(coef(lm_model), digits = 3)
plot!(title = L"\beta_0 = %$(βhat[1])" * " and " * L"\beta_1 = %$(βhat[2])")
Some random examples from my teaching
Switching backends
Let us first plot a surface in Plots.jl:
using Plots, LaTeXStrings

# Plot a surface with the gr backend
gr()
xs = range(-4, 4; length=150)
ys = range(-4, 4; length=150)
f(x, y) = sin(x) * cos(y) * exp(-(x^2 + y^2)/8)
Plots.surface(xs, ys, f; xlabel = L"x", ylabel = L"y", zlabel = L"f(x,y)", legend=false, camera = (30, 60))
Now we switch backend to PlotlyJS to get an interactive plot where we can pan, zoom and rotate:
import PlotlyJS
plotlyjs() # switching to plotlyjs for interactive plots
Plots.surface(xs, ys, f; xlabel = "x", ylabel = "y", zlabel = "f(x,y)", legend=false, camera = (30, 60))