Python Regular Expressions Cheat Sheet

The tough thing about learning data is remembering all the syntax. While at Dataquest we advocate getting used to consulting the Python documentation, sometimes it's nice to have a handy reference, so we've put together this cheat sheet to help you out!

This cheat sheet is based on Python 3’s documentation on regular expressions. If you're interested in learning Python, we have a free Python Programming: Beginnercourse for you to try out.

Download the cheat sheet here

 

Special Characters

^ | Matches the expression to its right at the start of a string. It matches every such instance before each \n in the string.

$ | Matches the expression to its left at the end of a string. It matches every such instance before each \n in the string.

. | Matches any character except line terminators like \n.

\ | Escapes special characters or denotes character classes.

A|B | Matches expression A or B. If A is matched first, B is left untried.

+ | Greedily matches the expression to its left 1 or more times.

* | Greedily matches the expression to its left 0 or more times.

? | Greedily matches the expression to its left 0 or 1 times. But if ? is added to qualifiers (+, *, and ? itself) it will perform matches in a non-greedy manner.

{m} | Matches the expression to its left m times, and not less.

{m,n} | Matches the expression to its left m to n times, and not less.

{m,n}? | Matches the expression to its left m times, and ignores n. See ? above.

Character Classes (a.k.a. Special Sequences)

\w | Matches alphanumeric characters, which means a-z, A-Z, and 0-9. It also matches the underscore, _.

\d | Matches digits, which means 0-9.

\D | Matches any non-digits.

\s | Matches whitespace characters, which include the \t, \n, \r, and space characters.

\S | Matches non-whitespace characters.

\b | Matches the boundary (or empty string) at the start and end of a word, that is, between \w and \W.

\B | Matches where \b does not, that is, the boundary of \w characters.

\A | Matches the expression to its right at the absolute start of a string whether in single or multi-line mode.

\Z | Matches the expression to its left at the absolute end of a string whether in single or multi-line mode.

Sets

[ ] | Contains a set of characters to match.

[amk] | Matches either a, m, or k. It does not match amk.

[a-z] | Matches any alphabet from a to z.

[a\-z] | Matches a, -, or z. It matches - because \ escapes it.

[a-] | Matches a or -, because - is not being used to indicate a series of characters.

[-a] | As above, matches a or -.

[a-z0-9] | Matches characters from a to z and also from 0 to 9.

[(+*)] | Special characters become literal inside a set, so this matches (, +, *, and ).

[^ab5] | Adding ^ excludes any character in the set. Here, it matches characters that are not a, b, or 5.

Groups

( ) | Matches the expression inside the parentheses and groups it.

(? ) | Inside parentheses like this, ? acts as an extension notation. Its meaning depends on the character immediately to its right.

(?PAB) | Matches the expression AB, and it can be accessed with the group name.

(?aiLmsux) | Here, a, i, L, m, s, u, and x are flags:

  • a — Matches ASCII only
  • i — Ignore case
  • L — Locale dependent
  • m — Multi-line
  • s — Matches all
  • u — Matches unicode
  • x — Verbose

(?:A) | Matches the expression as represented by A, but unlike (?PAB), it cannot be retrieved afterwards.

(?#...) | A comment. Contents are for us to read, not for matching.

A(?=B) | Lookahead assertion. This matches the expression A only if it is followed by B.

A(?!B) | Negative lookahead assertion. This matches the expression A only if it is not followed by B.

(?<=B)A | Positive lookbehind assertion. This matches the expression A only if B is immediately to its left. This can only matched fixed length expressions.

(?<!B)A | Negative lookbehind assertion. This matches the expression A only if B is not immediately to its left. This can only matched fixed length expressions.

(?P=name) | Matches the expression matched by an earlier group named “name”.

(...)\1 | The number 1 corresponds to the first group to be matched. If we want to match more instances of the same expresion, simply use its number instead of writing out the whole expression again. We can use from 1 up to 99 such groups and their corresponding numbers.

Popular Python re module Functions

re.findall(A, B) | Matches all instances of an expression A in a string B and returns them in a list.

re.search(A, B) | Matches the first instance of an expression A in a string B, and returns it as a re match object.

re.split(A, B) | Split a string B into a list using the delimiter A.

re.sub(A, B, C) | Replace A with B in the string C.

Useful Regular Expressions Sites for Python users

Python 3 re module documentation

Online regex tester and debugger

Source: https://www.dataquest.io/blog/regex-cheats...

Using React, Firebase, and Ant Design to Quickly Prototype Web Applications

In this guide I will show you how to use FirebaseReact, and Ant Design as building blocks to build functional, high-fidelity web applications. To illustrate this, we'll go through an example of building a todo list app.

These days, there are so many tools available for web development that it can feel paralyzing. Which server should you use? What front-end framework are you going to choose? Usually, the recommended approach is to use the technologies that you know best. Generally, this means choosing a battle-tested database like PostgreSQL or MySQL, choosing a MVC framework for your webserver (my favourite is Adonis), and either using that framework's rendering engine or using a client-side javascript library like ReactJS or AngularJS.

Using the above approach is productive -- especially if you have good boilerplate to get you started -- but what if you want to build something quickly with nearly zero setup time? Sometimes a mockup doesn't convey enough information to a client; sometimes you want to build out an MVP super fast for a new product.

The source code for this example is available here. If you're looking for a good IDE to use during this guide, I highly recommend Visual Studio Code.

A React Development Environment Using Create React App

React is a javascript library for building user interfaces. The library is "component based" meaning you can create building blocks and compose your interface out these reusable components. Create React App, on the other hand, is a zero-configuration React environment. It works out of the box with one shell command and keeps your environment up to date.

To get started, install Node.js for your system by following the instructions here.

Then start your new Create React App project:

npx create-react-app quick-todo && cd quick-todo

Now, you can run the development webserver with:

npm start

Visit http://localhost:3000/ in your browser and you should see this:

Great! You now have a functional React development environment.

Integrate Firebase with Your Application

Now that you have a React development environment, the next step is to integrate Firebase into your app. Firebase's core product is a real-time database service. This means that your users do not need to refresh a page to see updates to the state of the application and it takes no extra effort on your part to make this happen.

Head over https://firebase.google.com and create an account if you haven't already. Then create a new Firebase project called quick-todo.

Once you have your Firebase project, provision a "Cloud Firestore" database:

Here we're using no authentication on the database because we're building a prototype. When you build a real application, you'll want to create proper security rules but let's not worry about that for now.

Ok, now that your Firebase database is provisioned, let's get it integrated into your React app. In your project directory, run the following command to install the necessary dependencies:

npm i --save firebase @firebase/app @firebase/firestore

Then, in your project, add a new file in the src directory called firestore.js with the following contents:

firestore.js

import firebase from "@firebase/app";
import "@firebase/firestore";

const config = {
  apiKey: "<apiKey>",
  authDomain: "<authDomain>",
  databaseURL: "<databaseURL>",
  projectId: "<projectId>",
  storageBucket: "",
  messagingSenderId: "<messageingSenderId>"
};

const app = firebase.initializeApp(config);
const firestore = firebase.firestore(app);

export default firestore;

Make sure you insert the apiKey and other parameters from your own project. You can find these in your project's settings:

Ok! Now we have access to a real-time Firebase database anywhere in the app by importing our firestore.js utility:

import firestore from "./firestore";

Install Ant Design

Ant Design is a comprehensive design system that includes a full suite of React components. Because React is component-based, it's fairly simple to use Ant Design's React components as building blocks to quickly put together a prototype.

To start using Ant Design's React component system, install antd:

npm i --save antd

Pulling It All Together

We now have all all the tools we need to build our prototype. Let's use our environment to build a high-fidelity prototype of a todo app.

First, let's clean our slate. Modify App.js and App.css so that they look like this:

App.js

import React, { Component } from "react";

import "./App.css";

class App extends Component {
  render() {
    return <div className="App" />;
  }
}

export default App;

App.cs

@import "~antd/dist/antd.css";

.App {
  text-align: center;
}

Notice how we've imported the css for antd.

Now, let's setup some basic structure for our todo app. We can use the antdLayout component for this:

App.js

import React, { Component } from "react";
import { Layout } from "antd";

import "./App.css";

const { Header, Footer, Content } = Layout;

class App extends Component {
  render() {
    return (
      <Layout className="App">
        <Header className="App-header">
          <h1>Quick Todo</h1>
        </Header>
        <Content className="App-content">Content</Content>
        <Footer className="App-footer">&copy; My Company</Footer>
      </Layout>
    );
  }
}

export default App;

App.css

@import "~antd/dist/antd.css";

.App {
  text-align: center;
}

.App-header h1 {
  color: whitesmoke;
}

.App-content {
  padding-top: 100px;
  padding-bottom: 100px;
}

With these changes made, we can run our development server. You should seed something like this:

Now, we can utilize our firestore.js module that we create earlier to start adding todos to our real-time firebase database. You can read more about how to use Firebase Cloud Firestore here.

Let's walk through the following changes to our source code:

App.js

import React, { Component } from "react";
import { Layout, Input, Button } from "antd";

// We import our firestore module
import firestore from "./firestore";

import "./App.css";

const { Header, Footer, Content } = Layout;

class App extends Component {
  constructor(props) {
    super(props);
    // Set the default state of our application
    this.state = { addingTodo: false, pendingTodo: "" };
    // We want event handlers to share this context
    this.addTodo = this.addTodo.bind(this);
  }

  async addTodo(evt) {
    // Set a flag to indicate loading
    this.setState({ addingTodo: true });
    // Add a new todo from the value of the input
    await firestore.collection("todos").add({
      content: this.state.pendingTodo,
      completed: false
    });
    // Remove the loading flag and clear the input
    this.setState({ addingTodo: false, pendingTodo: "" });
  }

  render() {
    return (
      <Layout className="App">
        <Header className="App-header">
          <h1>Quick Todo</h1>
        </Header>
        <Content className="App-content">
          <Input
            ref="add-todo-input"
            className="App-add-todo-input"
            size="large"
            placeholder="What needs to be done?"
            disabled={this.state.addingTodo}
            onChange={evt => this.setState({ pendingTodo: evt.target.value })}
            value={this.state.pendingTodo}
            onPressEnter={this.addTodo}
          />
          <Button
            className="App-add-todo-button"
            size="large"
            type="primary"
            onClick={this.addTodo}
            loading={this.state.addingTodo}
          >
            Add Todo
          </Button>
        </Content>
        <Footer className="App-footer">&copy; My Company</Footer>
      </Layout>
    );
  }
}

export default App;

App.css

@import "~antd/dist/antd.css";

.App {
  text-align: center;
}

.App-header h1 {
  color: whitesmoke;
}

.App-content {
  padding-top: 100px;
  padding-bottom: 100px;
}

.App-add-todo-input {
  max-width: 300px;
  margin-right: 5px;
}

.App-add-todo-button {
}

With these changes, you can see that we now have an input on our application to add new todos.

Adding todos doesn't yet show up in the UI, but you can browse your Firebase database to see any todos that you add!

The last step to having a fully functional todo app is to show the list of todos and allow the user to complete them. To do this, we can use the Listcomponent from Ant Design to show incomplete todos. Take the following source code for example:

App.js

import React, { Component } from "react";
import { Layout, Input, Button, List, Icon } from "antd";

// We import our firestore module
import firestore from "./firestore";

import "./App.css";

const { Header, Footer, Content } = Layout;

class App extends Component {
  constructor(props) {
    super(props);
    // Set the default state of our application
    this.state = { addingTodo: false, pendingTodo: "", todos: [] };
    // We want event handlers to share this context
    this.addTodo = this.addTodo.bind(this);
    this.completeTodo = this.completeTodo.bind(this);
    // We listen for live changes to our todos collection in Firebase
    firestore.collection("todos").onSnapshot(snapshot => {
      let todos = [];
      snapshot.forEach(doc => {
        const todo = doc.data();
        todo.id = doc.id;
        if (!todo.completed) todos.push(todo);
      });
      // Sort our todos based on time added
      todos.sort(function(a, b) {
        return (
          new Date(a.createdAt).getTime() - new Date(b.createdAt).getTime()
        );
      });
      // Anytime the state of our database changes, we update state
      this.setState({ todos });
    });
  }

  async completeTodo(id) {
    // Mark the todo as completed
    await firestore
      .collection("todos")
      .doc(id)
      .set({
        completed: true
      });
  }

  async addTodo() {
    if (!this.state.pendingTodo) return;
    // Set a flag to indicate loading
    this.setState({ addingTodo: true });
    // Add a new todo from the value of the input
    await firestore.collection("todos").add({
      content: this.state.pendingTodo,
      completed: false,
      createdAt: new Date().toISOString()
    });
    // Remove the loading flag and clear the input
    this.setState({ addingTodo: false, pendingTodo: "" });
  }

  render() {
    return (
      <Layout className="App">
        <Header className="App-header">
          <h1>Quick Todo</h1>
        </Header>
        <Content className="App-content">
          <Input
            ref="add-todo-input"
            className="App-add-todo-input"
            size="large"
            placeholder="What needs to be done?"
            disabled={this.state.addingTodo}
            onChange={evt => this.setState({ pendingTodo: evt.target.value })}
            value={this.state.pendingTodo}
            onPressEnter={this.addTodo}
            required
          />
          <Button
            className="App-add-todo-button"
            size="large"
            type="primary"
            onClick={this.addTodo}
            loading={this.state.addingTodo}
          >
            Add Todo
          </Button>
          <List
            className="App-todos"
            size="large"
            bordered
            dataSource={this.state.todos}
            renderItem={todo => (
              <List.Item>
                {todo.content}
                <Icon
                  onClick={evt => this.completeTodo(todo.id)}
                  className="App-todo-complete"
                  type="check"
                />
              </List.Item>
            )}
          />
        </Content>
        <Footer className="App-footer">&copy; My Company</Footer>
      </Layout>
    );
  }
}

export default App;

App.css

@import "~antd/dist/antd.css";

.App {
  text-align: center;
}

.App-header h1 {
  color: whitesmoke;
}

.App-content {
  padding-top: 100px;
  padding-bottom: 100px;
}

.App-add-todo-input {
  max-width: 300px;
  margin-right: 5px;
}

.App-add-todo-button {
}

.App-todos {
  background-color: white;
  max-width: 400px;
  margin: 0 auto;
  margin-top: 20px;
  margin-bottom: 20px;
}

.App-todo {
  /* position: relative; */
}

.App-todo-complete {
  font-size: 22px;
  font-weight: bold;
  cursor: pointer;
  position: absolute;
  right: 24px;
}

With these final changes, we can see the todos that are added in our application as a list:

And there we have it! Using React, Firebase, and Ant Design, we were able to quickly create a high-fidelity web application. Using these tools can help you create something functional and aesthetically pleasing in no time.

This can be very valuable when you need to demonstrate functionality of an app to someone without spending too much time building it.

This guide focuses on using tools to quickly build prototypes but I think this approach can also be used to create production-ready web apps. Ant Design can be themed and Firebase is extremely scalable.

The only question of using Firebase over a traditional webserver is cost. For applications with many users, Firebase may get expensive quickly; however, using the traditional approach of webserver and database can also be costly to host. Additionally, you also need to take into account the time and cost of building, configuring, and managing your webserver and database!

Originally published at nrempel.com

Source: https://dev.to/nbrempel/using-react-fireba...

Data science with Python: Turn your conditional loops to Numpy vectors

Vectorization trick is fairly well-known to data scientists and is used routinely in coding, to speed up the overall data transformation, where simple mathematical transformations are performed over an iterable object e.g. a list. What is less appreciated is that it even pays to vectorize non-trivial code blocks such as conditional loops.

Python is fast emerging as the de-facto programming language of choice for data scientists. But unlike R or Julia, it is a general purpose language and does not have a functional syntax to start analyzing and transforming numerical data right out of the box. So, it needs specialized library.

Numpy , short for Numerical Python, is the fundamental package required for high performance scientific computing and data analysis in Python ecosystem. It is the foundation on which nearly all of the higher-level tools such as Pandas and scikit-learn are built. TensorFlow uses NumPy arrays as the fundamental building block on top of which they built their Tensor objects and graphflow for deep learning tasks (which makes heavy use of linear algebra operations on a long list/vector/matrix of numbers).

Many Numpy operations are implemented in C, avoiding the general cost of loops in Python, pointer indirection and per-element dynamic type checking. The speed boost depends on which operations you’re performing. For data science and modern machine learning tasks, this is an invaluable advantage.

My recent story about demonstrating the advantage of Numpy-based vectorization of simple data transformation task caught some fancy and was well received by readers. There was some interesting discussion on the utility of vectorization over code simplicity and such.

Now, mathematical transformation based on some predefined condition are fairly common in data science tasks. And it turns out one can easily vectorize simple blocks of conditional loops by first turning them into functions and then using numpy.vectorize method. In my previous article I showed an order of magnitude speed boost for numpy vectorization of simple mathematical transformation. For the present case, the speedup is less dramatic, as the internal conditional looping is still somewhat inefficient. However, there is at least 20–50% improvement in the execution time over other plain vanilla Python codes.

Here is the simple code to demonstrate it:

import numpy as np
from math import sin as sn
import matplotlib.pyplot as plt
import time

# Number of test points
N_point = 1000

# Define a custom function with some if-else loops
def myfunc(x,y): 
  if (x>0.5*y and y<0.3): return (sn(x-y)) 
  elif (x<0.5*y): return 0 
  elif (x>0.2*y): return (2*sn(x+2*y)) 
  else: return (sn(y+x))

# List of stored elements, generated from a Normal distribution
lst_x = np.random.randn(N_point)
lst_y = np.random.randn(N_point)
lst_result = []

# Optional plots of the data
plt.hist(lst_x,bins=20)
plt.show()
plt.hist(lst_y,bins=20)
plt.show()

# First, plain vanilla for-loop
t1=time.time()
First, plain vanilla for-loop
t1=time.time()
for i in range(len(lst_x)):
    x = lst_x[i]
    y= lst_y[i]
    if (x>0.5*y and y<0.3):
        lst_result.append(sn(x-y))
    elif (x<0.5*y):
        lst_result.append(0)
    elif (x>0.2*y):
        lst_result.append(2*sn(x+2*y))
    else:
        lst_result.append(sn(y+x))
t2=time.time()

print("\nTime taken by the plain vanilla for-loop\n----------------------------------------------\n{} us".format(1000000*(t2-t1)))

# List comprehension
print("\nTime taken by list comprehension and zip\n"+'-'*40)
%timeit lst_result = [myfunc(x,y) for x,y in zip(lst_x,lst_y)]

# Map() function
print("\nTime taken by map function\n"+'-'*40)
%timeit list(map(myfunc,lst_x,lst_y))

# Numpy.vectorize method
print("\nTime taken by numpy.vectorize method\n"+'-'*40)
vectfunc = np.vectorize(myfunc,otypes=[np.float],cache=False)
%timeit list(vectfunc(lst_x,lst_y))

# Results
Time taken by the plain vanilla for-loop
----------------------------------------------
2000.0934600830078 us

Time taken by list comprehension and zip
----------------------------------------
1000 loops, best of 3: 810 µs per loop

Time taken by map function
----------------------------------------
1000 loops, best of 3: 726 µs per loop

Time taken by numpy.vectorize method
----------------------------------------
1000 loops, best of 3: 516 µs per loop

Notice that I have used %timeit Jupyter magic command everywhere I could write the evaluated expression in one line. That way I am effectively running at least 1000 loops of the same expression and averaging the execution time to avoid any random effect. Consequently, if you run this whole script in a Jupyter notebook, you may slightly different result for the first case i.e. plain vanilla for-loop execution, but the next three should give very consistent trend (based on your computer hardware).

We see the evidence that, for this data transformation task based on a series of conditional checks, the vectorization approach using numpy routinely gives some 20–50% speedup compared to general Python methods.

It may not seem a dramatic improvement, but every bit of time saving adds up in a data science pipeline and pays back in the long run! If a data science job requires this transformation to happen a million times, that may result in a difference between 2 days and 8 hours.

In short, wherever you have a long list of data and need to perform some mathematical transformation over them, strongly consider turning those python data structures (list or tuples or dictionaries) into numpy.ndarray objects and using inherent vectorization capabilities.

Numpy provides a C-API for even faster code execution but it takes away the simplicity of Python programming. This Scipy lecture note shows all the related options you have in this regard.

There is an entire open-source, online book on this topic by a French neuroscience researcher. Check it out here.

Source: https://www.codementor.io/tirthajyotisarka...