Friday, August 17, 2018

Data Science beyond the basics

Exploratory Data Analysis

  • Histogram plotting: the input is the data (or a list of distributions) we want to plot, and we can specify the bins. Each sample can also be weighted differently; it doesn't have to count as 1. The hist function also returns values: how many items fall in each bin, along with the plot.
  • Feature extraction and dimensionality reduction are also important. They simplify the data and reduce computational cost before the data is fed into a machine learning algorithm. Algorithms will run faster, use less memory, and in some cases even perform better.
  • Anomaly (outlier) detection: handling or removing outliers and abnormal values in the data helps the model generalize better and represent the data more accurately.
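
As a rough illustration of the histogram and outlier bullets above, here is a minimal sketch using NumPy. The data is synthetic, and the 0.5 weight, the 20 bins, and the 1.5 x IQR rule are assumptions chosen for the example, not something prescribed by the notes. numpy.histogram is used so the counts can be inspected without drawing anything; matplotlib's hist accepts the same bins and weights arguments and returns the counts and bin edges together with the drawn plot.

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed so the sketch is repeatable
samples = rng.normal(loc=0.0, scale=1.0, size=1000)  # synthetic data
samples = np.append(samples, [8.0, -9.0])  # two planted outliers

# Each sample can carry its own weight instead of counting as 1.
weights = np.full(samples.shape, 0.5)

# counts[i] is the total weight that fell into bin i;
# edges holds the bin boundaries and has one more entry than counts.
counts, edges = np.histogram(samples, bins=20, weights=weights)

# Simple outlier rule (an assumption for this sketch): flag anything
# further than 1.5 * IQR outside the middle 50% of the data.
q1, q3 = np.percentile(samples, [25, 75])
iqr = q3 - q1
lower, upper = q1 - 1.5 * iqr, q3 + 1.5 * iqr
is_outlier = (samples < lower) | (samples > upper)
cleaned = samples[~is_outlier]

print("bins:", len(counts), "total weight:", counts.sum())
print("removed", int(is_outlier.sum()), "of", samples.size, "samples")
```

Whether to drop, cap, or keep flagged points depends on the data; the IQR rule is only one common starting point.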

Machine Learning

Machine learning is emerging as a popular field of data science. It has predictive power and draws on applied statistics and pattern recognition.

Machine learning is taking data mining to the next level.

Major machine learning tasks include classification, regression and clustering.

Questions that Business Analysts and Decision Makers are Interested In

  • Who are the best customers? In other words, which customers have the highest Customer Lifetime Value?
  • Causal relationships:
    • Results of recent experiments (more prevalent in startup culture)
    • Hypotheses about whether one segment is actually different from another
    • Is the result significant, or is it random chance?
    • Please note that determining a causal relationship requires controlled studies to control for extraneous variables. In many industries, such as biotech, statistical significance is a must: a prerequisite for next-step analysis or further business investment.
  • Customer demographics: summary statistics, customer segmentation and more.
  • How to measure profitability and other Key Performance Indicators (KPIs)

Statistical Hypothesis Testing
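
One way to ask "is the difference between two segments significant, or is it random chance?" is a permutation test. The sketch below uses only the Python standard library. The function name permutation_p_value, the two segments, and their values are all made up for illustration, and a permutation test is just one of several tests that could be used here.

```python
import random
from statistics import mean


def permutation_p_value(group_a, group_b, n_permutations=5000, seed=0):
    """Two-sided permutation test for a difference in means.

    Hypothetical helper for this sketch: shuffles the pooled values and
    counts how often a random split is at least as extreme as the observed one.
    """
    rng = random.Random(seed)
    observed = abs(mean(group_a) - mean(group_b))
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    at_least_as_extreme = 0
    for _ in range(n_permutations):
        rng.shuffle(pooled)
        diff = abs(mean(pooled[:n_a]) - mean(pooled[n_a:]))
        if diff >= observed:
            at_least_as_extreme += 1
    # The add-one correction keeps the estimate strictly above zero.
    return (at_least_as_extreme + 1) / (n_permutations + 1)


# Made-up values for two customer segments (illustration only).
segment_a = [12.1, 9.8, 11.4, 10.9, 12.7, 11.8, 10.2, 12.3]
segment_b = [9.1, 8.7, 10.2, 9.5, 8.9, 10.0, 9.3, 8.4]

p_value = permutation_p_value(segment_a, segment_b)
print("estimated p-value:", p_value)
```

A small p-value suggests the observed difference would be unusual if segment labels were assigned at random. It does not by itself establish a causal relationship, which still needs a controlled study.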

Python for Data Science

  • Use the conda command, which is similar to pip, for installing and launching packages
  • Anaconda comes with a wonderful Python IDE called Spyder

Scientific Computing using Scipy

  • scipy.integrate.quad: use the quad function to compute a definite integral. Its arguments are the function to integrate, the lower bound, and the upper bound. It returns a tuple containing an approximation of the result and an estimate of the error.
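
A minimal sketch of the quad call described above. The integrand x**2 and the bounds 0 and 3 are arbitrary choices for illustration; the exact answer is 9.

```python
from scipy.integrate import quad


def integrand(x):
    # Function to integrate (chosen only for this example).
    return x ** 2


# quad(function, lower_bound, upper_bound) returns a tuple:
# (approximate value of the integral, estimate of the absolute error).
value, abs_error = quad(integrand, 0.0, 3.0)

print("integral:", value)
print("estimated error:", abs_error)
```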

Becoming a Data Engineer

A data engineer takes care of data quality and provides data quickly and reliably.
For example, a data funnel starts when a JavaScript tag is installed to gather user browsing data, and ends with a SaaS product in which the client can visualize the data.
Many things happen in between: data gathering, aggregation, storage and delivery. ContentSquare collects browser data, so its data volume grows fast: 70 million web pages per day, 3 terabytes of new data each month, and on the order of a petabyte (10**15 bytes) per year. Technologies mentioned: Kafka, Spark, Elastic, Scala, Akka.
https://youtu.be/hFsGKjPVOn8?list=WL

React UI, UI UX, Reactstrap React Bootstrap

React UI with Material UI: install the icons package with yarn add @material-ui/icons. Reactstrap forms: controlled forms and uncontrolled forms. Columns, grid.