Shadow edges are "soft" because the light is gradually obscured as you move into the shadow.
This happens only with area lights, not with point lights or directional lights (such as the sun on a clear day).
With area light sources, the shadow between fully bright and fully dark is called the penumbra. The fully dark shadow is called the umbra.
How can we estimate how much light arrives at a point, $x$, on a surface?
For point and directional sources, we can use a shadow ray.
For area sources, we can sample to find how much of the light arrives at $x$.
In the figure below, six shadow rays are sent from $x$ toward randomly-chosen points the source. Only three of them arrive.
Let $n$ rays be sent toward the source. Let $r_1, r_2, \ldots r_k$ be the $k$ rays that arrive. Then the "average" arriving ray is $$\widehat{\: r} = {1 \over k} \sum_{i=1}^k r_i$$
and the intensity of light arriving on that average ray is ${k \over n}$ times the total output of the light.
Then $\widehat{\:r}$ and ${k \over n}$ can be used in the Phong model to illuminate $x$.
We'll assume that the source is triangular. If it's not, then break it into a union of triangles and handle each triangle separately.
The biggest problem is to generate sample points uniformly randomly across the triangle.
Here's a method that uses Barycentric coordinates on a triangle $a \: b \: c$:
And here are 2000 sample points in a triangle:
The problem with this method is that points are not generated uniformly across the triangle. The points are denser toward corner $b$.
This occurred because, although all $\alpha$ value were uniformly distributed in $[0,1]$, the $\beta$ values were not: They are more likely to be small values.
In the figure below, each outlined area has equal chance of being selected because each spans the same range of $\alpha$ values. So each area will contain, on average, the same number of points. The area that spans a smaller range of $\beta$ values will be smaller and therefore have a denser sampling.
With rejection sampling for triangles, parameters are chosen uniformly randomly and are then rejected if the corresponding point is outside the triangle.
This results in a uniform sampling, as shown below:
Always check that your sampling is uniform. If not, try rejection sampling!