This past week I had the opportunity to attend my first PASS Summit in Seattle, and it was an awesome experience! I got to meet lots of great people, ran into some old colleagues, and learned about the latest tools and technologies. In this post I’ll summarize my first two days at the conference and provide some links to the content and presentations that I found useful. I’ll follow up with another post summarizing the regular conference sessions.
Applied Data Science for the SQL Server Professional
This year I attended two days of pre-conference sessions. On the first day I switched between Azure Infrastructure and Applied Data Science for the SQL Server Professional. The Azure Infrastructure session was more in-depth and took a different path than I had in mind, and much of its content was not something I saw myself using day to day, so I didn’t spend a lot of time in it. The Applied Data Science session was a higher-level overview of the different tools you can leverage to do data science, and it gave me some useful information to walk away with.
Azure Machine Learning Studio
This is a tool that you can use to create your machine learning models, and best of all, it is free. You can use up to 10GB of storage without having to sign up or pay anything.
The idea here is that you can use this tool to create your model, validate your results and make sure that it works, and then take the scripts and build it in R or Python. You can then take that into your SQL Server environment and run your models inside SQL Server.
There are also lots of demos available in the Cortana Gallery that you can use to learn about the different algorithms. For an intro video to Cortana Intelligence, check out this tutorial by Buck Woody.
Some other takeaways from this session (might be common sense for some of you 🙂 ):
- Steps for prepping your data for machine learning: select columns, categorize data, clean missing data, split data.
- Use linear regression to determine whether two variables have a positive or negative correlation with each other (the sign of the fitted slope tells you which).
- Python will win out over R (personally, I think they will both continue to have their place).
- If you are using an Azure Machine Learning Gateway, be aware that this does not play nice with the Power BI Gateway, meaning you can’t have both on the same machine.
- Jupyter is a good tool for writing out and testing your Python scripts.
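To make the first two takeaways a bit more concrete, here is a minimal sketch in plain Python (easy to paste into a Jupyter cell). It walks through the four prep steps (select columns, categorize data, clean missing data, split data) and then fits a simple least-squares line to see whether the relationship is positive or negative. Everything in it is an assumption for illustration: the sample rows, the column names (`region`, `ad_spend`, `sales`), and the helper functions are hypothetical and were not part of the session, and a real project would more likely lean on a library than on hand-rolled helpers.

```python
import random

# Hypothetical sample rows (not from the session); None marks a missing value.
rows = [
    {"id": 1, "region": "west", "ad_spend": 10.0, "sales": 25.0},
    {"id": 2, "region": "east", "ad_spend": 20.0, "sales": 45.0},
    {"id": 3, "region": "west", "ad_spend": None, "sales": 50.0},
    {"id": 4, "region": "east", "ad_spend": 30.0, "sales": 65.0},
    {"id": 5, "region": "west", "ad_spend": 40.0, "sales": 85.0},
    {"id": 6, "region": "east", "ad_spend": 50.0, "sales": 105.0},
]


def select_columns(rows, columns):
    """Step 1: keep only the columns the model needs."""
    return [{c: r[c] for c in columns} for r in rows]


def categorize(rows, column):
    """Step 2: replace text labels with integer codes (sorted for repeatability)."""
    labels = sorted({r[column] for r in rows})
    codes = {label: i for i, label in enumerate(labels)}
    return [{**r, column: codes[r[column]]} for r in rows]


def clean_missing(rows):
    """Step 3: drop any row that still contains a missing value."""
    return [r for r in rows if all(v is not None for v in r.values())]


def split(rows, train_fraction=0.8, seed=0):
    """Step 4: shuffle with a fixed seed, then split into train and test sets."""
    shuffled = rows[:]
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


def fit_line(xs, ys):
    """Ordinary least squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


selected = select_columns(rows, ["region", "ad_spend", "sales"])
prepared = clean_missing(categorize(selected, "region"))
train, test = split(prepared)

slope, intercept = fit_line(
    [r["ad_spend"] for r in train],
    [r["sales"] for r in train],
)

# The sign of the slope gives the direction of the relationship.
if slope > 0:
    direction = "positive"
elif slope < 0:
    direction = "negative"
else:
    direction = "no linear"
print(f"slope={slope:.2f}, intercept={intercept:.2f}: {direction} relationship")
```

Note that the sign of the slope only tells you the direction of the relationship, not how strong it is, so you would still want to check the fit against the held-out `test` rows.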
Visual Data Storytelling
On the second day I attended what was probably one of my favorite sessions of the conference. It was an all-day workshop on building storyboards to help design dashboards that deliver actionable insights. We were split into teams from the outset, and we worked on how to create a four-part storyboard that focused on:
- Identifying the top goal
- Creating the targets that tell us if we are on the right track
- Building the trends that describe our targets
- Describing the actions that are needed to maintain or exceed the targets
This was one of those sessions where you really had to be there in person to get the full benefit, and I wouldn’t do it justice trying to explain it here. Instead, I’ll share some links to research pages recommended by the speaker, Mico Yuk, and highlight some of my key takeaways.
- Define your goal clearly and early on
- Build the KPIs that tie to your goal afterwards
- Trends are not always timelines
  - Example: What are you spending your money on?
- Every block in your storyboard should be a tile/visual in your dashboard
- You need the measurement of when a goal needs to be attained
- “The reason why visuals are hard is because the measurements are wrong”
- How you label your goals, targets, and KPIs has a direct impact on how they are interpreted and received by the audience
  - e.g. Grow vs. Increase (Increase has no emotional association)
- We have to think like marketers and use the right side of our brain when writing storyboards
- When meeting with users/stakeholders, eliminate these questions:
  - What do you want to see?
  - How do you want it to look?
  - Instead, replace them with: What story are you trying to tell with this data?
- Leverage storyboards to help you get from Information to Knowledge
- Do not build a storyboard based on what data is available
  - Instead, focus on the story and then go back to the user to explain what they can focus on
- Interpretation is key. We need to ensure that the message we are trying to convey is what the end user is interpreting.
And finally, a couple of slides which I thought highlighted the key message from the session:
Let me know your thoughts or feedback on the pre-conference sessions in the comments section, and stay tuned for the next post!