Upload solutions

mfherbst · Jul 23, 2021 · 3200e2b
1 parent ee1d3fe
commit 3200e2b
Showing 22 changed files with 542 additions and 50 deletions.
diff --git a/0_Getting_started.ipynb b/0_Getting_started.ipynb
@@ -64,7 +64,7 @@
     "    \n",
     "    \n",
     "* So who are you?\n",
-    "    - Quick poll: http://etc.ch/rWup"
+    "    - Quick poll: INSERT URL HERE"
    ]
   },
   {
@@ -82,7 +82,7 @@
     "- I have prepared:\n",
     "    * Some theory\n",
     "    * Many examples\n",
-    "    * Short exercises \n",
+    "    * Short exercises (with [solutions](solutions))\n",
     "    \n",
     "    \n",
     "- This is meant to be interactive:\n",

diff --git a/1_Installation.ipynb b/1_Installation.ipynb
@@ -19,7 +19,10 @@
     "  * Note: Our test suite segfaults in 1.7 (see this PR: https://github.com/JuliaLang/julia/pull/41516)\n",
     "- Some working python setup\n",
     "\n",
-    "These notes have been made targeting DFTK 0.3.9."
+    "\n",
+    "- These notes have been made targeting **DFTK 0.3.9**.<br />\n",
+    "  Some interface additions and bug fixes of this version are required for these notes,\n",
+    "  so please ensure you update from a previous version of the code."
    ]
   },
   {

diff --git a/3_Density_functional_theory.ipynb b/3_Density_functional_theory.ipynb
@@ -83,7 +83,7 @@
     "    + \\int V_\\text{ext} \\rho + \\int V_\\text{H}[\\rho] \\rho + \\int V_\\text{xc}[\\rho] \\rho + E_\\text{nuclear}$$\n",
     "    \n",
     "  with\n",
-    "     * the electron **density** $\\rho = \\sum_i^N |\\psi_i|^2$\n",
+    "     * the electron **density** $\\rho = \\sum_i^N 2 |\\psi_i|^2$\n",
     "       being directly dependent on *all* orbitals.\n",
     "     * $\\sum_i 2 \\int \\psi_i^\\ast \\left(-\\frac12 \\Delta\\right) \\psi_i$ describing\n",
     "       the **kinetic** energy of the electrons\n",
@@ -172,7 +172,7 @@
     "To allow for writing the SCF problem more compact\n",
     "we define the **potential-to-density map**\n",
     "$$\n",
-    "D(V) = \\sum_{i=1}^N |\\psi_i|^2 \\qquad \\text{$\\psi_i$ are the $N$ lowest eigenvectors of $-\\frac12 \\Delta + V$}.\n",
+    "D(V) = \\sum_{i=1}^N 2 |\\psi_i|^2 \\qquad \\text{$\\psi_i$ are the $N$ lowest eigenvectors of $-\\frac12 \\Delta + V$}.\n",
     "$$\n",
     "With it the SCF problem can be written as\n",
     "$$ \\rho = D(V(\\rho)). $$\n",
@@ -403,7 +403,7 @@
    "source": [
     "## Modelling aluminum\n",
     "\n",
-    "**Exercise:** Try running a simulation of aluminium yourself.\n",
+    "**Exercise 1:** Try running a simulation of aluminium yourself.\n",
     "\n",
     "For aluminum a possible structural setup is\n",
     "```julia\n",

diff --git a/4_Solving_the_SCF_problem.ipynb b/4_Solving_the_SCF_problem.ipynb
@@ -108,7 +108,7 @@
     "callback = DFTK.ScfDefaultCallback() ∘ plot_callback\n",
     "    \n",
     "# Run the SCF and show the plot\n",
-    "scfres = self_consistent_field(aluminium_setup(5); tol=1e-12, callback=callback);\n",
+    "scfres = self_consistent_field(aluminium_setup(); tol=1e-12, callback=callback);\n",
     "p"
    ]
   },
@@ -117,7 +117,7 @@
    "id": "furnished-congress",
    "metadata": {},
    "source": [
-    "**Exercise:** Try making this problem harder by running on `aluminium_setup(2)` and `aluminium_setup(5)` (or higher if you can efford it). What do you observe in the plot?"
+    "**Exercise 1:** Try making this problem harder by running on `aluminium_setup(2)` and `aluminium_setup(5)` (or higher if you can afford it). What do you observe in the plot?"
    ]
   },
   {
@@ -222,7 +222,7 @@
   },
   {
    "cell_type": "markdown",
-   "id": "radio-phase",
+   "id": "composite-springfield",
    "metadata": {},
    "source": [
     "## Step 2: Damped iterations\n",
@@ -231,8 +231,24 @@
     "$$ \\rho_{n+1} = \\rho_{n} + \\alpha (F(\\rho_n) - \\rho_n) $$\n",
     "In other words the update $F(\\rho_n) - \\rho_n$ proposed in the $n$-th SCF step is not fully taken, but scaled-down by the damping $\\alpha$.\n",
     "\n",
-    "**Exercise:** Modify `fixed_point_iteration` such that it supports this *damped* fixed-point iteration. Try different values for $\\alpha$ between $0$ and $1$ and estimate roughly the $\\alpha$ which gives fastest convergence. For which $\\alpha$ do you observe no convergence at all?\n",
-    "\n",
+    "**Exercise 2:** Modify `fixed_point_iteration` such that it supports this *damped* fixed-point iteration. Try different values for $\\alpha$ between $0$ and $1$ and estimate roughly the $\\alpha$ which gives fastest convergence. For which $\\alpha$ do you observe no convergence at all?"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "arabic-haiti",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "# Your solution here ..."
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "radio-phase",
+   "metadata": {},
+   "source": [
     "**Remark:** The observations you make here are general. We will argue in the next notebook why every SCF converges (locally) if a small enough $\\alpha > 0$ is chosen."
    ]
   },
@@ -246,7 +262,7 @@
     "The `fixed_point_iteration` function above (with the damping extension) already covers the main gist of standard DFT algorithms. To make things converge faster the next step to follow is Anderson acceleration, where not only $\\rho_n$ and $F(\\rho_n)$, but also older iterates are used to propose the next density.\n",
     "\n",
     "For Anderson one exploits that the update $R(\\rho) = F(\\rho) - \\rho$ is also the residual of the fixed-point problem $F(\\rho) = \\rho$, i.e. how far away we are from the fixed-point density. A good next density $\\rho_{n+1}$ therefore should be found by minimising an approximation for $R(\\rho_{n+1})$. Assuming the SCF was linear in the density (which it is not), a good idea is to find a linear combination of residuals\n",
-    "$$\\sum_i \\beta_i R(\\rho_i)$$\n",
+    "$$\\min_{\\beta_i} \\left\\| \\sum_i \\beta_i R(\\rho_i) \\right\\|^2$$\n",
     "which has the smallest possible norm and to use these coefficients $\\beta_i$ to extrapolate the next\n",
     "density\n",
     "$$ \\rho_{n+1} =  \\sum_i \\beta_i (\\rho_i + \\alpha R(\\rho_i)) $$\n",
@@ -279,8 +295,13 @@
     "        converged && break\n",
     "        \n",
     "        ρnext = vec(ρ) .+ vec(Rρ)\n",
-    "        if !isempty(Rs)           \n",
-    "            # will be developed in the workshop ...\n",
+    "        if !isempty(Rs)\n",
+    "            M = hcat(Rs...) .- vec(Rρ)\n",
+    "            βs = -(M \\ vec(Rρ))\n",
+    "            \n",
+    "            for (iβ, β) in enumerate(βs)\n",
+    "                ρnext .+= β .* (ρs[iβ] .- vec(ρ) .+ Rs[iβ] .- vec(Rρ))\n",
+    "            end\n",
     "        end\n",
     "                    \n",
     "        push!(ρs, vec(ρ))\n",
@@ -308,17 +329,15 @@
     "```\n",
     "to choose a damping of $\\alpha = 0.8$ and run for at most `maxiter` iterations.\n",
     "\n",
-    "**Exercise:** Pick $\\alpha = 0.8$ and make the problem harder by increasing `repeat` (e.g. `2`, `4`, `6`, `8`). Can you make Anderson fail to converge? What do you notice in terms of the number of iterations and runtimes?"
+    "**Exercise 3:** Pick $\\alpha = 0.8$ and make the problem harder by increasing `repeat` (e.g. `2`, `4`, `6`, `8`). Can you make Anderson fail to converge? What do you notice in terms of the number of iterations and runtimes?"
    ]
   },
   {
-   "cell_type": "code",
-   "execution_count": null,
-   "id": "dramatic-queens",
+   "cell_type": "markdown",
+   "id": "modular-delta",
    "metadata": {},
-   "outputs": [],
    "source": [
-    "# Your solution here"
+    "**Remark:** Anderson acceleration goes by many names and comes in many variants. It is sometimes also called Pulay mixing or DIIS (direct inversion of the iterative subspace). Broyden's methods are closely related as well. For more details on the relationship of these methods see this [review on SCF methods](http://doi.org/10.1088/1361-648x/ab31c0) as well as this [paper on the convergence analysis of DIIS methods](https://arxiv.org/abs/2002.12850)."
    ]
   },
   {
@@ -386,11 +405,11 @@
     "\n",
     "To use `KerkerMixing` with DFTK run the SCF as follows\n",
     "```julia\n",
-    "self_consistent_field(basis; α=0.8, mixing=KerkerMixing());\n",
+    "self_consistent_field(basis; damping=0.8, mixing=KerkerMixing());\n",
     "```\n",
     "\n",
     "\n",
-    "**Exercise:** Try this setup for different values of `repeat` and check the number of iterations needed. Other mixings DFTK has to offer are `DielectricMixing` (best for semiconductors) or `LdosMixing` (best for metal-insulator-mixtures). Try them as well and summarise your findings.\n"
+    "**Exercise 4:** Try this setup for different values of `repeat` and check the number of iterations needed. Other mixings DFTK has to offer are `DielectricMixing` (best for semiconductors) or `LdosMixing` (self-adapting, suitable for metals, insulators *and* inhomogeneous mixtures). Try them as well and summarise your findings.\n"
    ]
   },
   {
@@ -403,7 +422,11 @@
     "#### Takeaways:\n",
     "- Large systems require a matching preconditioner to converge in few SCF iterations\n",
     "- Anderson acceleration and/or small damping aids convergence\n",
-    "- Provided Anderson is used one often gets away using a non-matching preconditioner"
+    "- Provided Anderson is used one often gets away using a non-matching preconditioner\n",
+    "\n",
+    "\n",
+    "- The callback infrastructure of DFTK's SCF allows you to modify a few more aspects of the iteration\n",
+    "  very easily. See [this example](https://docs.dftk.org/stable/examples/custom_solvers/) in the documentation for details."
    ]
   }
  ],
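Note on `4_Solving_the_SCF_problem.ipynb`, Step 2: the damped update $\rho_{n+1} = \rho_n + \alpha (F(\rho_n) - \rho_n)$ can be tried out on a toy problem before touching the SCF. The following is only a minimal sketch and not the workshop's solution: `toy_map` and `damped_iteration` are hypothetical names, and a small linear map stands in for the SCF map $F$. Its matrix has an eigenvalue of $-1.5$, so the undamped iteration ($\alpha = 1$) diverges, whereas a moderate damping converges.

```julia
using LinearAlgebra

# Hypothetical stand-in for the SCF map: F(x) = A*x + b with fixed point xstar.
function toy_map()
    A = Diagonal([-1.5, 0.2, 0.5])
    b = ones(3)
    F(x) = A * x + b
    xstar = (I - A) \ b
    return F, xstar
end

# Damped fixed-point iteration x_{n+1} = x_n + α (F(x_n) - x_n).
function damped_iteration(F, x0; α=0.5, tol=1e-10, maxiter=500)
    x = copy(x0)
    for n in 1:maxiter
        R = F(x) - x                  # residual of the fixed-point problem
        norm(R) < tol && return (x=x, iterations=n, converged=true)
        x = x + α * R                 # take only a fraction α of the update
    end
    return (x=x, iterations=maxiter, converged=false)
end

F, xstar = toy_map()
for α in (1.0, 0.5, 0.2)
    res = damped_iteration(F, zeros(3); α=α)
    println("α = $α: converged = $(res.converged) after $(res.iterations) iterations, ",
            "error = $(norm(res.x - xstar))")
end
```

Comparing the iteration counts for the different values of `α` mirrors what Exercise 2 asks for on the real SCF problem: too large a damping fails to converge, too small a damping converges slowly.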

diff --git a/5_Analysing_SCF_convergence.ipynb b/5_Analysing_SCF_convergence.ipynb
@@ -16,14 +16,17 @@
     "\n",
     "Near the fixed point $\\rho_\\ast = D(V(\\rho_\\ast))$ the error $e_n = \\rho_n - \\rho_\\ast$ is small and we can expand to first order:\n",
     "$$ \\begin{align*}\n",
-    "D(V(\\rho_\\ast + e_n)) &\\simeq D[V(\\rho_\\ast) + V'(e_n)] \\\\\n",
-    "&\\simeq D(V(\\rho_\\ast)) + D'(V'(e_n)))\\\\\n",
-    "&= \\rho_\\ast + D'(V'(e_n)))\n",
+    "D(V(\\rho_\\ast + e_n)) &\\simeq D\\left[V(\\rho_\\ast) + V'|_{\\rho_\\ast} e_n\\right] \\\\\n",
+    "&\\simeq D(V(\\rho_\\ast)) + D'|_{V(\\rho_\\ast)} V'|_{\\rho_\\ast} e_n\\\\\n",
+    "&= \\rho_\\ast + D'|_{V(\\rho_\\ast)} V'|_{\\rho_\\ast} e_n\n",
     "\\end{align*}$$\n",
     "\n",
     "The derivatives $D'$ and $V'$ are again important quantities and are given special symbols:\n",
     "- Hartree-exchange-correlation **kernel** $K_\\text{Hxc} = V'$\n",
-    "- Independent-particle **susceptibility** $\\chi_0 = D'$"
+    "- Independent-particle **susceptibility** $\\chi_0 = D'$\n",
+    "\n",
+    "where, for simplicity, we have dropped the notation indicating that these quantities are evaluated at the fixed point,\n",
+    " i.e. at $\\rho_\\ast$ and $V(\\rho_\\ast)$, respectively."
    ]
   },
   {
@@ -323,7 +326,7 @@
    "id": "exclusive-exclusive",
    "metadata": {},
    "source": [
-    "**Exercise:** Let's see what the Kerker preconditioner can do when it comes to charge sloshing.\n",
+    "**Exercise 1:** Let's see what the Kerker preconditioner can do when it comes to charge sloshing.\n",
     "\n",
     "Find the largest eigenvalue for the Aluminium SCF in case the Kerker preconditioner is used.\n",
     "*Hint:* You can construct the operator $P^{-1} \\varepsilon^\\dagger$ by simply chaining the functions (`Pinv_Kerker ∘ epsilon`). Assuming that the smallest eigenvalue is about $0.8$, what is the condition number now? Feel free to take a look at the shape of the largest eigenvalue. What do you notice?\n",
@@ -333,6 +336,16 @@
     "Keeping in mind that the condition number is linked to the convergence speed: Which is setup should be employed to keep the number of required SCF iterations independent of system size."
    ]
   },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "palestinian-energy",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "# Your solution here ..."
+   ]
+  },
   {
    "cell_type": "markdown",
    "id": "regulation-migration",
@@ -397,7 +410,7 @@
    "metadata": {},
    "outputs": [],
    "source": [
-    "scfres = self_consistent_field(helium_setup(40); tol=1e-12, mixing=KerkerMixing());"
+    "scfres = self_consistent_field(helium_setup(30); tol=1e-12, mixing=KerkerMixing());"
    ]
   },
   {
@@ -407,15 +420,15 @@
    "metadata": {},
    "outputs": [],
    "source": [
-    "scfres = self_consistent_field(helium_setup(40); tol=1e-12);"
+    "scfres = self_consistent_field(helium_setup(30); tol=1e-12);"
    ]
   },
   {
    "cell_type": "markdown",
    "id": "straight-attempt",
    "metadata": {},
    "source": [
-    "**Exercise** This can be confirmed by investigating the eigenvalues. Here are some good settings for you to play on this problem. Find the condition numbers with and without `KerkerMixing` and explain the observations."
+    "**Exercise 2:** This can be confirmed by investigating the eigenvalues. Here are some good settings for you to play with on this problem. Find the condition numbers with and without `KerkerMixing` and explain the observations in the SCFs on the Helium system we just ran."
    ]
   },
   {
@@ -427,7 +440,7 @@
    "source": [
     "using KrylovKit\n",
     "\n",
-    "scfres = self_consistent_field(helium_setup(40); tol=1e-12);\n",
+    "scfres = self_consistent_field(helium_setup(30); tol=1e-12);\n",
     "epsilon, Pinv_Kerker = construct_Pinv_epsilon(scfres)\n",
     "\n",
     "operator = epsilon\n",

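Note on `5_Analysing_SCF_convergence.ipynb`: the exercises link the condition number to the convergence speed. As a rough sketch, assume the linearised error propagates as $e_{n+1} = (1 - \alpha P^{-1} \varepsilon^\dagger) e_n$ and that the eigenvalues of $P^{-1} \varepsilon^\dagger$ are real and lie in $[\lambda_\text{min}, \lambda_\text{max}]$ with $\lambda_\text{min} > 0$. Under these assumptions the standard estimates for a simple damped iteration are as below. All function names are hypothetical helpers, and the values of `λmax` are made up for illustration; only `λmin = 0.8` is taken from Exercise 1.

```julia
# Condition number of the (preconditioned) dielectric operator.
condition_number(λmin, λmax) = λmax / λmin

# Damping that minimises the worst-case contraction factor max |1 - α λ|.
optimal_damping(λmin, λmax) = 2 / (λmin + λmax)

# Contraction factor per iteration at the optimal damping: (κ - 1) / (κ + 1).
convergence_rate(λmin, λmax) = (λmax - λmin) / (λmax + λmin)

# Iterations needed to shrink the error by the factor `reduction` (e.g. 1e-6).
iterations_needed(rate, reduction) = ceil(Int, log(reduction) / log(rate))

λmin = 0.8
for λmax in (2.0, 10.0, 50.0)   # illustrative values only
    κ    = condition_number(λmin, λmax)
    rate = convergence_rate(λmin, λmax)
    println("λmax = $λmax: κ = $κ, α_opt = $(optimal_damping(λmin, λmax)), ",
            "rate = $rate, iterations for 1e-6: $(iterations_needed(rate, 1e-6))")
end
```

In this simple model a larger condition number means a rate closer to one and therefore more iterations, which is why a preconditioner that keeps the largest eigenvalue bounded matters for large systems.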
diff --git a/6_Floating_point_error.ipynb b/6_Floating_point_error.ipynb
@@ -31,7 +31,7 @@
     "in a procedure is to solve the problem using higher precision\n",
     "floating-point types and compare the matching digits.\n",
     "\n",
-    "To make things a little more interesting we will use 32bit floating-point arithmetic:"
+    "To make things a little more interesting we will use 32-bit floating-point arithmetic and we will try to converge the energy to the rarely required accuracy of `1e-16`."
    ]
   },
   {
@@ -52,8 +52,8 @@
     "# DFTK will use the floating-point type used to represent the lattice\n",
     "# to deduce the floating-point type for the computation\n",
     "model = model_DFT(Array{Float32}(lattice), atoms, [:lda_x, :lda_c_vwn])\n",
-    "basis = PlaneWaveBasis(model; Ecut=7, kgrid=[4, 4, 4], fft_size=(16, 16, 16))\n",
-    "scfres = self_consistent_field(basis, tol=1e-10)\n",
+    "basis = PlaneWaveBasis(model; Ecut=7, kgrid=[1, 1, 1], fft_size=(16, 16, 16))\n",
+    "scfres = self_consistent_field(basis, tol=1e-16, maxiter=40)\n",
     "\n",
     "results = Dict{DataType, Any}(\n",
     "    Float32 => scfres.energies.total\n",
@@ -67,9 +67,8 @@
    "id": "direct-idaho",
    "metadata": {},
    "source": [
-    "**Exercise:** To get a rough idea how many of these energy digits can be trusted,\n",
-    "re-run the computation using `Float64` and `Double64` (a floating point type offering even higher accuracy that 64bits). For the `Double64` case also converge the SCF to higher precision (e.g. `tol=1e-16`).\n",
-    "How many digits of the energy can be trusted at `Float32` and `Float64` level?"
+    "**Exercise 1:** To get a rough idea how many of these energy digits can be trusted,\n",
+    "re-run the computation using `Float64` and `Double64` (a floating-point type offering even higher accuracy than 64 bits). How many digits of the energy can be trusted at `Float32` and `Float64` level?"
    ]
   },
   {
@@ -83,11 +82,18 @@
     "using DoubleFloats  # Defines Double64\n",
     "\n",
     "# Your solution here\n",
-    "\n",
     "results[Float64]  = zero(Float64)\n",
     "results[Double64] = zero(Double64)\n",
     "\n",
-    "@show results"
+    "println()\n",
+    "println(\"Float32:   $(results[Float32])\")\n",
+    "println(\"Float64:   $(results[Float64])\")\n",
+    "println(\"Double64:  $(results[Double64])\")\n",
+    "\n",
+    "println()\n",
+    "println(\"Errors versus Double64:\")\n",
+    "println(\"Float32:   $(results[Float32] - results[Double64])\")\n",
+    "println(\"Float64:   $(results[Float64] - results[Double64])\")"
    ]
   },
   {
@@ -122,8 +128,8 @@
     "atoms = [Si => [ones(3)/8, -ones(3)/8]]\n",
     "\n",
     "model = model_DFT(Array{Float32}(lattice), atoms, [:lda_x, :lda_c_vwn], symmetries=false)\n",
-    "basis = PlaneWaveBasis(model; Ecut=7, kgrid=[4, 4, 4], fft_size=(16, 16, 16))\n",
-    "scfres = self_consistent_field(basis, tol=1e-10);"
+    "basis = PlaneWaveBasis(model; Ecut=7, kgrid=[1, 1, 1], fft_size=(16, 16, 16))\n",
+    "scfres = self_consistent_field(basis, tol=1e-16, maxiter=40);"
    ]
   },
   {
@@ -162,7 +168,7 @@
     "\n",
     "# Get interval equivalents of key quantities\n",
     "intModel = model_DFT(Array{Interval{Float32}}(lattice), atoms, [:lda_x, :lda_c_vwn], symmetries=false)\n",
-    "intBasis = PlaneWaveBasis(intModel; Ecut=7, kgrid=[4, 4, 4], fft_size=(16, 16, 16))\n",
+    "intBasis = PlaneWaveBasis(intModel; Ecut=7, kgrid=[1, 1, 1], fft_size=(16, 16, 16))\n",
     "intOccupation = [Interval.(occk) for occk in scfres.occupation]\n",
     "intEigvals = [Interval.(λk) for λk in scfres.eigenvalues]\n",
     "intEigvecs = [Interval.(ψk) for ψk in scfres.ψ]\n",
@@ -198,10 +204,14 @@
    "metadata": {},
    "source": [
     "Clearly this calculation looses quite a bit of precision ...\n",
-    "Unfortunately IntervalArithmetic only guarantees that the first digit of the energy\n",
-    "can be trusted. Moreover it is not at all guaranteed that calculation has even converged!\n",
+    "Unfortunately IntervalArithmetic guarantees at most one trustworthy digit of the energy, and possibly none\n",
+    "(this depends a bit on the way the iteration progresses).\n",
+    "Moreover it is not at all guaranteed that the calculation has even converged!\n",
+    "\n",
+    "There are two things one should notice with respect to this result:\n",
+    "   - Interval arithmetic in general overestimates the floating-point error. In our experiments we saw that about 3-4 digits of the energy agree with the higher-precision data types, which is in fact closer to what one would expect from practical calculations at single-precision level.\n",
+    "   - Nevertheless interval arithmetic is a great tool to identify places where precision is potentially lost. For example in this case one can already see that the density computation loses about 3-4 significant digits. If we improve upon this part of the code (e.g. compute the density in `Float64`), then we still have a reasonable number of trustworthy digits in the energy (about 3-4).\n",
     "\n",
-    "However, one should notice that interval arithmetic in general overestimates the floating-point error. For example in this case one can already see that the density computation already looses about 3-4 significant digits. If we improve upon this (e.g. compute the density in `Float64`), then we still have a reasonable number of trustworthy digits in the energy (about 3-4).\n",
     "\n",
     "#### Takeaway\n",
     "- Trustworthy DFT calculations in pure `Float32` are tricky.\n",

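Note on `6_Floating_point_error.ipynb`: the strategy of re-running a computation in higher precision and comparing the matching digits can be illustrated without DFTK. The sketch below is an assumption-laden toy: `partial_sum` is a hypothetical stand-in for the SCF energy, `matching_digits` is a hypothetical helper, and `BigFloat` from Julia's standard library replaces `Double64` as the reference type so that no extra package is needed.

```julia
# Hypothetical toy computation standing in for the SCF energy:
# the partial sum of 1/k^2, accumulated in the floating-point type T.
function partial_sum(::Type{T}, n) where {T}
    s = zero(T)
    for k in 1:n
        s += one(T) / T(k)^2
    end
    return s
end

# Same layout as the `results` dictionary in the notebook.
results = Dict{DataType, Any}(
    T => partial_sum(T, 100_000) for T in (Float32, Float64, BigFloat)
)

# Number of decimal digits on which x agrees with the reference value.
function matching_digits(x, ref)
    x == ref && return Inf
    return -log10(abs(Float64((x - ref) / ref)))
end

for T in (Float32, Float64)
    digits = matching_digits(results[T], results[BigFloat])
    println("$T: about $(round(digits; digits=1)) digits agree with BigFloat")
end
```

As in the notebook, the lower-precision result is only as trustworthy as the digits it shares with the higher-precision reference.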
diff --git a/7_Properties_automatic_differentation.ipynb b/7_Properties_automatic_differentation.ipynb
@@ -189,7 +189,7 @@
    "id": "middle-ownership",
    "metadata": {},
    "source": [
-    "**Exercise:** Use the following code fragment to implement stresses using `ForwardDiff`:"
+    "**Exercise 1:** Use the following code fragment to implement stresses using `ForwardDiff`:"
    ]
   },
   {
@@ -199,8 +199,9 @@
    "metadata": {},
    "outputs": [],
    "source": [
-    "using FiniteDiff\n",
-    "scfres = compute_silicon(ε)\n",
+    "using ForwardDiff\n",
+    "using DFTK\n",
+    "scfres = compute_silicon(zeros(3))\n",
     "\n",
     "function recompute_silicon_energy_stresses(lattice)\n",
     "    atoms = scfres.basis.model.atoms\n",