{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Quick Start"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Instantiate a NumPy array, which holds the ground set $V$: "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "metadata": {},
   "outputs": [],
   "source": [
    "import numpy as np\n",
    "import datetime\n",
    "\n",
    "np.random.seed(42)\n",
    "def timed(f, *args):\n",
    "    tStart = datetime.datetime.now()\n",
    "    res = f(*args)\n",
    "    tEnd = datetime.datetime.now()\n",
    "    return res, (tEnd - tStart).total_seconds()\n",
    "\n",
    "N = 25000\n",
    "d = 50\n",
    "V = np.random.random((N, d))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "We will now select a random subset $S \\subset V$ and evaluate its function value on the GPU using the `ExemplarClustering` class provided by the `exemcl` package:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "metadata": {},
   "outputs": [],
   "source": [
    "from exemcl import ExemplarClustering\n",
    "\n",
    "S = np.take(V, np.random.choice(V.shape[0], size=5000, replace=False), axis=0)\n",
    "exem = ExemplarClustering(ground_set=V, device=\"gpu\")\n",
    "fvalue = exem(S)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Change precision"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "It is also possible to change the required floating point precision to either half, single or double precision by adding the `precision` parameter during construction and specifying `fp16`, `fp32` or `fp64`, respectively:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Function value (fp16): 13.42227840423584 (took 0.166322s).\n",
      "Function value (fp32): 13.4232816696167 (took 0.179814s).\n",
      "Function value (fp64): 13.423275558060581 (took 3.023624s).\n"
     ]
    }
   ],
   "source": [
    "for fp in [\"fp16\", \"fp32\", \"fp64\"]:\n",
    "    exem = ExemplarClustering(ground_set=V, device=\"gpu\", precision=fp)\n",
    "    fvalue, secs = timed(exem.__call__, S)\n",
    "    print(f\"Function value ({fp}): {fvalue} (took {secs}s).\")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Multi-set evaluation"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "If you are using Exemplar-based clustering as target function for optimization, e.g. using the Greedy routine, you might want to evaluate more than one set per (optimization) step. We will now create a *list* of subsets, which we will evaluate for their function value:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "100 function values found (took 10.662428s).\n"
     ]
    }
   ],
   "source": [
    "# Sample a set of 100 sets with 5000 vectors each, which should be evaluated for their function value.\n",
    "S_multi = [np.take(V, np.random.choice(V.shape[0], size=5000, replace=False), axis=0) for _ in range(100)]\n",
    "exem = ExemplarClustering(ground_set=V)\n",
    "fvalues, secs = timed(exem.__call__, S_multi)\n",
    "print(f\"{len(fvalues)} function values found (took {secs}s).\")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Alternatively, you might have some fixed set $S$ and you are looking for marginal gains resulting from marginal elements $E = \\left\\lbrace e_1, ..., e_n \\right\\rbrace$. "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 5,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "100 marginal function values found (took 10.44639s).\n"
     ]
    }
   ],
   "source": [
    "# Sample a set of 100 vectors, which with S should be evaluated for their respective marginal function values.\n",
    "e_multi = [np.take(V, np.random.choice(V.shape[0], size=1, replace=False), axis=0).flatten() for _ in range(100)]\n",
    "exem = ExemplarClustering(ground_set=V)\n",
    "marginals, secs = timed(exem.__call__, S, e_multi)\n",
    "print(f\"{len(marginals)} marginal function values found (took {secs}s).\")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## CPU computation "
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "You might be also interested in using a CPU-only version of this algorithm. In this case you can simply replace `device=gpu` with `device=cpu`. Please keep in mind, that FP16 operation is not available for CPU devices."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 6,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Function value (fp32): 13.423282623291016 (took 2.068594s).\n",
      "Function value (fp64): 13.423275558060581 (took 2.436802s).\n"
     ]
    }
   ],
   "source": [
    "for fp in [\"fp32\", \"fp64\"]:\n",
    "    exem = ExemplarClustering(ground_set=V, device=\"cpu\", precision=fp)\n",
    "    fvalue, secs = timed(exem.__call__, S)\n",
    "    print(f\"Function value ({fp}): {fvalue} (took {secs}s).\")"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.9.1"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 4
}