Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

For evaluation: Implementation of MarkdownDeep.net over Journaley. #129

Merged
merged 6 commits into from
Nov 26, 2015
Merged
Show file tree
Hide file tree
Changes from 2 commits
Commits
File filter

Filter by extension

Filter by extension


Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
2 changes: 2 additions & 0 deletions Journaley.Core/Journaley.Core.csproj
Original file line number Diff line number Diff line change
Expand Up @@ -97,7 +97,9 @@
<ErrorText>This project references NuGet package(s) that are missing on this computer. Enable NuGet Package Restore to download them. For more information, see http://go.microsoft.com/fwlink/?LinkID=322105. The missing file is {0}.</ErrorText>
</PropertyGroup>
<Error Condition="!Exists('..\packages\StyleCop.MSBuild.4.7.48.2\build\StyleCop.MSBuild.Targets')" Text="$([System.String]::Format('$(ErrorText)', '..\packages\StyleCop.MSBuild.4.7.48.2\build\StyleCop.MSBuild.Targets'))" />
<Error Condition="!Exists('$(SolutionDir)\.nuget\NuGet.targets')" Text="$([System.String]::Format('$(ErrorText)', '$(SolutionDir)\.nuget\NuGet.targets'))" />
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Was there a particular reason to change this file?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I really don't know. I think this was when I enabled Nuget Package Recovery in VS.

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, you shouldn't enable that in the solution. In fact, it turns out these changes (in Journaley.Core, Journaley, VersionExtractor) broke the build on my side. Please remove these lines from all the affected projects.

You should just go to Tools -> Options... -> NuGet Package Manager -> General and check both the Allow NuGet to download missing packages and Automatically check for missing pakcages during build in Visual Studio options. The solution should build fine after enabling these options. If not, let's troubleshoot that together later on.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll reset my repo and I'll do this again. I'll message you if it worked with just removing the code first.

</Target>
<Import Project="$(SolutionDir)\.nuget\NuGet.targets" Condition="Exists('$(SolutionDir)\.nuget\NuGet.targets')" />
<!-- To modify your build process, add your task inside one of the targets below and uncomment it.
Other similar extension points exist, see Microsoft.Common.targets.
<Target Name="BeforeBuild">
Expand Down
56 changes: 11 additions & 45 deletions Journaley/Forms/MainForm.cs
Original file line number Diff line number Diff line change
@@ -1,6 +1,6 @@
namespace Journaley.Forms
{
extern alias MDS;
extern alias MDD;

using System;
using System.Collections.Generic;
Expand All @@ -21,7 +21,7 @@
using Journaley.Core.Utilities;
using Journaley.Core.Watcher;
using Journaley.Utilities;
using MDS.MarkdownSharp;
using MDD.MarkdownDeep;
using Pabo.Calendar;
using Squirrel;

Expand Down Expand Up @@ -475,7 +475,8 @@ private Markdown Markdown
if (this.markdown == null)
{
this.markdown = new Markdown();
((MarkdownOptions)this.markdown.Options).AutoNewLines = true;
this.markdown.ExtraMode = true;
this.markdown.SafeMode = false;
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What are these two options?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

From the website:

bool SafeMode
Set to true to only allow whitelisted safe html tags

bool ExtraMode
Set to true to enable ExtraMode, which enables the same set of features as implemented by PHP Markdown Extra.

Extramode is needed, becuase most of the implementations in Day One uses Markdown Extra stuff like tables and footnotes

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks. We might want to turn on the SafeMode option later for resolving #118. Let's keep this false for now.

}

return this.markdown;
Expand Down Expand Up @@ -638,53 +639,18 @@ private void OnGetMinMaxInfo(ref Message m)
/// </summary>
private void UpdateWebBrowser()
{
string initialString = this.SelectedEntry.EntryText;
string docType = "<!DOCTYPE HTML PUBLIC \"-//W3C//DTD HTML 4.01//EN\" \"http://www.w3.org/TR/html4/strict.dtd\">";
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm fine with adding the doctype declaration, but why not HTML5? Let's just go with:

string docType = "<!DOCTYPE html>";

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We consider the ones that have IE 8 or 9 where HTML5 might not work. Remember that the WebBrowser control relies on the Internet Explorer available. But yeah, we could put HTML5, but I'm not sure if it renders to other people's computers nicely.

On second thought, maybe we could use the HTML5 doctype since the other IEs will conform to it anyway.


string formattedString = string.Format(
"<style type=\"text/css\">\n<!-- Font CSS -->\n{0}\n<!-- Size CSS -->\n{1}\n<!-- Custom CSS -->\n{2}\n</style><html><body><div>{3}</div></body></html>",
"{0}\n<style type=\"text/css\">\n<!-- Font CSS -->\n{1}\n<!-- Size CSS -->\n{2}\n<!-- Custom CSS -->\n{3}\n</style>\n<html>\n<body>\n<div>\n{4}\n</div>\n</body>\n</html>",
docType,
this.GetWebBrowserTypefaceCSS(),
this.GetWebBrowserSizeCSS(),
this.CustomCSS ?? string.Empty,
Markdown.Transform(this.SelectedEntry.EntryText));

this.webBrowser.DocumentText = this.RemoveLineBreaksWithinLists(formattedString);
}

/// <summary>
/// Removes the wrong line breaks within nested ordered/unordered lists.
/// Hack to fix the issue #114.
/// </summary>
/// <param name="formattedString">The entry content formatted by MarkdownSharp.</param>
/// <returns>correctly formatted string with the wrong line breaks removed.</returns>
private string RemoveLineBreaksWithinLists(string formattedString)
{
System.Text.StringBuilder builder = new System.Text.StringBuilder();
string pattern = @"^</?(ol|ul|li)>";

var lines = formattedString.Split(
new string[] { "\r\n", "\n" },
StringSplitOptions.None);

for (int i = 0; i < lines.Length - 1; ++i)
{
string line = lines[i];
string nextLine = lines[i + 1];

// Remove the "<br />" tags only if the current line and the next line are starting
// with the opening/closing list tags. By doing this, it can prevent removing
// line breaks that are intentionally added by the user.
if (Regex.IsMatch(line, pattern) && Regex.IsMatch(nextLine, pattern))
{
while (line.EndsWith("<br />"))
{
line = line.Substring(0, line.LastIndexOf("<br />"));
}
}

builder.AppendLine(line);
}

builder.AppendLine(lines.Last());
PostMarkdownParser.PostMarkdown(Markdown.Transform(initialString)));

return builder.ToString();
this.webBrowser.DocumentText = formattedString;
}

/// <summary>
Expand Down
10 changes: 7 additions & 3 deletions Journaley/Journaley.csproj
Original file line number Diff line number Diff line change
Expand Up @@ -81,9 +81,10 @@
<SpecificVersion>False</SpecificVersion>
<HintPath>..\packages\squirrel.windows.0.99.2\lib\Net45\ICSharpCode.SharpZipLib.dll</HintPath>
</Reference>
<Reference Include="MarkdownSharp, Version=0.0.0.0, Culture=neutral, processorArchitecture=MSIL">
<HintPath>..\packages\MarkdownSharp.1.13.0.0\lib\35\MarkdownSharp.dll</HintPath>
<Aliases>MDS</Aliases>
<Reference Include="MarkdownDeep, Version=1.5.4615.26275, Culture=neutral, processorArchitecture=MSIL">
<HintPath>..\packages\MarkdownDeep.NET.1.5\lib\.NetFramework 3.5\MarkdownDeep.dll</HintPath>
<Private>True</Private>
<Aliases>MDD</Aliases>
</Reference>
<Reference Include="Microsoft.WindowsAPICodePack">
<HintPath>..\packages\winapicp.1.1\lib\Microsoft.WindowsAPICodePack.dll</HintPath>
Expand Down Expand Up @@ -202,6 +203,7 @@
<DesignTime>True</DesignTime>
<DependentUpon>Resources.resx</DependentUpon>
</Compile>
<Compile Include="Utilities\PostMarkdownParser.cs" />
<Compile Include="Utilities\CursorReader.cs" />
<Compile Include="Utilities\FontReader.cs" />
<Compile Include="Utilities\HtmlToText.cs">
Expand Down Expand Up @@ -499,10 +501,12 @@
<ErrorText>This project references NuGet package(s) that are missing on this computer. Enable NuGet Package Restore to download them. For more information, see http://go.microsoft.com/fwlink/?LinkID=322105. The missing file is {0}.</ErrorText>
</PropertyGroup>
<Error Condition="!Exists('..\packages\StyleCop.MSBuild.4.7.48.2\build\StyleCop.MSBuild.Targets')" Text="$([System.String]::Format('$(ErrorText)', '..\packages\StyleCop.MSBuild.4.7.48.2\build\StyleCop.MSBuild.Targets'))" />
<Error Condition="!Exists('$(SolutionDir)\.nuget\NuGet.targets')" Text="$([System.String]::Format('$(ErrorText)', '$(SolutionDir)\.nuget\NuGet.targets'))" />
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please remove the NuGet package restore related lines (504 & 509).

</Target>
<PropertyGroup>
<PostBuildEvent>"$(SolutionDir)InsertIcons.exe" "$(TargetPath)" "$(ProjectDir)Images\NewEntry.ico"</PostBuildEvent>
</PropertyGroup>
<Import Project="$(SolutionDir)\.nuget\NuGet.targets" Condition="Exists('$(SolutionDir)\.nuget\NuGet.targets')" />
<!-- To modify your build process, add your task inside one of the targets below and uncomment it.
Other similar extension points exist, see Microsoft.Common.targets.
<Target Name="BeforeBuild">
Expand Down
114 changes: 114 additions & 0 deletions Journaley/Utilities/PostMarkdownParser.cs
Original file line number Diff line number Diff line change
@@ -0,0 +1,114 @@
namespace Journaley.Utilities
{
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Text.RegularExpressions;
using System.Threading.Tasks;

/// <summary>
/// Helper class for fixing MarkdownDeep parsed HTML.
/// </summary>
public class PostMarkdownParser
{
/// <summary>
/// Perform fixes after MarkdownDeep parsing for publishing
/// to Journaley.
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You need to be more explicit here and say what it fixes.

  1. automatically add <br /> tags where the user hit enter.
  2. add strikethrough style support

/// </summary>
/// <param name="formattedString">Formatted HTML string</param>
/// <returns>Properly formatted HTML string for publishing.</returns>
public static string PostMarkdown(string formattedString)
{
// First convert all code blocks using <p> into <pre>
// which is used in almost all implementations of
// Markdown, even in the original Markdown specs.
formattedString = Regex.Replace(formattedString, @"<p><code>", "<pre><code>", RegexOptions.Multiline);
formattedString = Regex.Replace(formattedString, @"</code></p>", "</code></pre>", RegexOptions.Multiline);

System.Text.StringBuilder builder = new System.Text.StringBuilder();
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Don't need the System.Text. part. You can just say StringBuilder because you already added using System.Text above.

System.Text.StringBuilder paragraphBuilder = new System.Text.StringBuilder();

var isNewLine = 0;
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The variable name isNewLine doesn't really explain what this variable does.
How about something like parseState?


string newLineString;
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  1. Declare this variable when its used in line 86. This makes it easier to understand the code, because the scope of this variable is actually within the else if (parseState == 0) block. It is fine for the other variables like builder, paragraphBuilder, and parseState, because their state persists over the loops.
  2. Let's rename this to paragraph, because it is a better name, it comes from the paragraphBuilder, and the name newLineString can be misleading because it usually refers to \r\n or some other platform dependent newline string.

string strippedOverString;
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

strippedOverString is declared here and then not used until the end of the method. Either declare this when it's actually used, or remove this variable.


var lines = formattedString.Split(
new string[] { "\r\n", "\n" },
StringSplitOptions.None);

for (int i = 0; i < lines.Length - 1; ++i)
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you're following the pattern I used for removing <br /> tags before, but this loop is not appropriate here.

Just use foreach (string line in lines) instead, and remove string line = lines[i];, and remove builder.AppendLine(lines.Last()); in line 108.

As it stands now, this loop doesn't run for the last line, and for example, if there was a strikethrough markup in the last line, it wouldn't be applied correctly.

{
string line = lines[i];

// We first incrementally check if the line begins with <p> tag. When it does,
// We turn on the paragraphBuilder to append the remaining lines.

// If the checks hit a line with the </p> closing, we disable check and proceed
// to parse the text, add <br /> tags and then adding it to builder
// where it is assembled with the rest of the file.

// We dump the text to builder if it hits a <pre> tag.
if (line.Contains("<p>"))
{
isNewLine = 1;
}
else if (line.Contains("</p>"))
{
isNewLine = 0;
}
else if (line.Contains("<pre>"))
{
isNewLine = -1;
}

if (isNewLine == 1)
{
paragraphBuilder.AppendLine(line);
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's not appropriate to add this line here, because it won't be used and immediately cleared when this line also contains </p>. Let's add it in the else clause of the following if block.


if (line.Contains("</p>"))
{
// Single line strikethroughs.
line = Regex.Replace(line, @"(~~)(.*?)(~~)", "<del>$2</del>", RegexOptions.Multiline);
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is the RegexOptions.Multiline option necessary? I think the line should always be a single line at this point.

builder.AppendLine(line);
paragraphBuilder.Clear();
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This line is no longer needed.

isNewLine = 0;
}
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Add an else clause like this:

else
{
    paragraphBuilder.Add(line + "<br />");
}

I thought about this for a while, and I think this is the perfect place to add <br /> tags.
This is the place where we're within the <p> tag, and seeing the line break before the </p> tag.
I don't think we need that complex regex below.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you thought this after I clarified to you about how the regex worked right?

}
else if (isNewLine == 0)
{
paragraphBuilder.AppendLine(line);

newLineString = paragraphBuilder.ToString();
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please rename this variable to paragraph as I suggested above.


// Multiline strikethrough.
if (Regex.IsMatch(newLineString, @"(~~)(.*?)(~~)"))
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you explain me why you need this if statement? Isn't it enough to just have the first Regex.Replace statement?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I think you're right about that.

I actually assumed that problem, because supposedly in Day one you can do this:

~~Nikki~~
Nikki
~~Nikki~~

And it parses it correctly like:
Nikki
Nikki
Nikki

But in the previous code, it parsed like:

Nikki Nikki
Nikki

That's why I added an regex check, but I think because of the foreach, it might have fixed it.

{
newLineString = Regex.Replace(newLineString, @"(~~)(.*?)(~~)", "<del>$2</del>", RegexOptions.Multiline);
}
else
{
newLineString = Regex.Replace(newLineString, @"<p>(~~)", "<p><del>", RegexOptions.Multiline);
newLineString = Regex.Replace(newLineString, @"(~~)</p>", "</del></p>", RegexOptions.Multiline);
}

newLineString = Regex.Replace(newLineString, @"^([\w\*\>\<\[][^\r\n]*)(?=\r?\n[\w\*\>\<\[].*$)", "$1<br />", RegexOptions.Multiline);
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can't really understand this regex. Can you explain me what it does?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This was actually a proposed fix from a MarkdownDeep PR. I actually tested it on a regex tester, it does check for only line feeds in the end and at the same time disregards the end of the string.

In test:

<p>Line breaks are fun
Right?
It depends</p>

Only the first line and the next line are added with the <br> tag and not the last one.

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Delete this code, now that we add the <br /> tags above.


builder.AppendLine(newLineString);
paragraphBuilder.Clear();
}
else
{
// For <pre> tagged text, just skip check and append to builder
builder.AppendLine(line);
}
}

builder.AppendLine(lines.Last());
strippedOverString = Regex.Replace(builder.ToString(), @"^(\r\s)", string.Empty, RegexOptions.Multiline);
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure why this line is needed either. Please explain.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The output in the end is actually a bit messy. Has a lot of line breaks. That's why I added a check for the carriage return + whitespace so that the output HTML string won't be messy.


return strippedOverString;
}
}
}
2 changes: 1 addition & 1 deletion Journaley/packages.config
Original file line number Diff line number Diff line change
@@ -1,7 +1,7 @@
<?xml version="1.0" encoding="utf-8"?>
<packages>
<package id="DeltaCompressionDotNet" version="1.0.0" targetFramework="net45" />
<package id="MarkdownSharp" version="1.13.0.0" targetFramework="net40-Client" />
<package id="MarkdownDeep.NET" version="1.5" targetFramework="net45" />
<package id="Splat" version="1.6.2" targetFramework="net45" />
<package id="squirrel.windows" version="0.99.2" targetFramework="net45" />
<package id="StyleCop.MSBuild" version="4.7.48.2" targetFramework="net40-Client" />
Expand Down
4 changes: 4 additions & 0 deletions VersionExtractor/VersionExtractor.csproj
Original file line number Diff line number Diff line change
Expand Up @@ -11,6 +11,8 @@
<AssemblyName>VersionExtractor</AssemblyName>
<TargetFrameworkVersion>v4.5</TargetFrameworkVersion>
<FileAlignment>512</FileAlignment>
<SolutionDir Condition="$(SolutionDir) == '' Or $(SolutionDir) == '*Undefined*'">..\</SolutionDir>
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please revert this entire file.

<RestorePackages>true</RestorePackages>
</PropertyGroup>
<PropertyGroup Condition=" '$(Configuration)|$(Platform)' == 'Debug|AnyCPU' ">
<PlatformTarget>AnyCPU</PlatformTarget>
Expand Down Expand Up @@ -55,7 +57,9 @@
<ErrorText>This project references NuGet package(s) that are missing on this computer. Enable NuGet Package Restore to download them. For more information, see http://go.microsoft.com/fwlink/?LinkID=322105. The missing file is {0}.</ErrorText>
</PropertyGroup>
<Error Condition="!Exists('..\packages\StyleCop.MSBuild.4.7.48.2\build\StyleCop.MSBuild.Targets')" Text="$([System.String]::Format('$(ErrorText)', '..\packages\StyleCop.MSBuild.4.7.48.2\build\StyleCop.MSBuild.Targets'))" />
<Error Condition="!Exists('$(SolutionDir)\.nuget\NuGet.targets')" Text="$([System.String]::Format('$(ErrorText)', '$(SolutionDir)\.nuget\NuGet.targets'))" />
</Target>
<Import Project="$(SolutionDir)\.nuget\NuGet.targets" Condition="Exists('$(SolutionDir)\.nuget\NuGet.targets')" />
<!-- To modify your build process, add your task inside one of the targets below and uncomment it.
Other similar extension points exist, see Microsoft.Common.targets.
<Target Name="BeforeBuild">
Expand Down