(control) Add warnings about domain data contamination
parent
0b105b5986
commit
182c0cf28e
@@ -1,8 +1,15 @@
 <h1 class="my-3">Download Sample Data</h1>
 
 <div class="my-3 p-3 border bg-light">
-This will download sample crawl data from <a href="https://downloads.marginalia.nu">downloads.marginalia.nu</a> onto Node {{node.id}}.
+<p>This will download sample crawl data from <a href="https://downloads.marginalia.nu">downloads.marginalia.nu</a> onto Node {{node.id}}.
 This is a sample of real crawl data. It is intended for demo, testing and development purposes. Several sets are available.
+</p>
+
+<p>
+<span class="text-danger">Warning</span> While processing the sample data, the domains associated with it will be loaded
+into the domain database. This means that if you run the re-crawl action on this machine, regardless of which crawl data
+is specified, the domains in the sample data will be crawled!
+</p>
 </div>
 
 <form method="post" action="actions/download-sample-data">
@@ -6,6 +6,9 @@
 If you are just looking to test the software, feel free to use <a href="https://downloads.marginalia.nu/domain-list-test.txt">this
 short list of marginalia-related websites</a>, that are safe to crawl repeatedly without causing any problems.
 </p>
+
+<p><span class="text-danger">Warning</span> Ensure <a href="?view=download-sample-data">downloaded sample data</a> has not been loaded onto this instance
+before performing this action, otherwise those domains will also be crawled while re-crawling in the future!</p>
 </div>
 
 <form method="post" action="actions/new-crawl-specs">
@@ -18,6 +18,8 @@
 crawl spec. If the document has changed, it will be re-crawled. If it has not changed, it will be skipped,
 and the previous data will be retained. This is both faster and easier on the target server.
 </p>
+<p><span class="text-danger">Warning</span> Ensure <a href="?view=download-sample-data">downloaded sample data</a>
+has not been loaded onto this instance before performing this action, otherwise those domains will also be crawled!</p>
 </div>
 
 <form method="post" action="actions/recrawl">