Efficient Algorithms for Destroyed Document ReconstructionStereo images rectification and disparity: which algorithms?Which algorithms are usable for heatmaps and what are their pros and consEfficient flood filling (seed filling)How to test image segmentation algorithms?Viewing images that compressed using lossless algorithmsSources on dictionary learning and related algorithmsHigh Dimensional Spaces for ImagesImage registration algorithms for images with varying distancesAlgorithms to correct misspelled word?More Efficient Feature Method Than Haar-Feature For Face Detection
How do I write real-world stories separate from my country of origin?
Proto-Indo-European (PIE) words with IPA
Are there historical examples of audiences drawn to a work that was "so bad it's good"?
Is there a solution to paying high fees when opening and closing lightning channels once we hit a fee only market?
Does science define life as "beginning at conception"?
Does attacking (or having a rider attack) cancel Charge/Pounce-like abilities?
amsmath: How can I use the equation numbering and label manually and anywhere?
Gas chromatography flame ionization detector (FID) - why hydrogen gas?
A nasty indefinite integral
Can a UK national work as a paid shop assistant in the USA?
Why did Nick Fury not hesitate in blowing up the plane he thought was carrying a nuke?
One word for 'the thing that attracts me'?
How to create razor wire
Find this Unique UVC Palindrome ( ignoring signs and decimal) from Given Fractional Relationship
Surface of the 3x3x3 cube as a graph
Adobe Illustrator: How can I change the profile of a dashed stroke?
Caught with my phone during an exam
Can someone get a spouse off a deed that never lived together and was incarcerated?
Why is this python script running in background consuming 100 % CPU?
Keeping the dodos out of the field
Must every right-inverse of a linear transformation be a linear transformation?
Are clauses with "который" restrictive or non-restrictive by default?
Make the `diff` command look only for differences from a specified range of lines
How could the B-29 bomber back up under its own power?
Efficient Algorithms for Destroyed Document Reconstruction
Stereo images rectification and disparity: which algorithms?Which algorithms are usable for heatmaps and what are their pros and consEfficient flood filling (seed filling)How to test image segmentation algorithms?Viewing images that compressed using lossless algorithmsSources on dictionary learning and related algorithmsHigh Dimensional Spaces for ImagesImage registration algorithms for images with varying distancesAlgorithms to correct misspelled word?More Efficient Feature Method Than Haar-Feature For Face Detection
$begingroup$
I am not certain this is the proper site for this question however I am mainly looking for resources on this topic (perhaps code). I was watching TV and one of the characters had a lawyer who destroyed his documents using a paper shredder. A lab tech said that the shredder was special.
I am not familiar with this area of computer science/ mathematics but I am looking for information on efficient algorithms to reconstruct destroyed documents. I can come up with a naive approach that is brute force fairly easily I imagine but just going through all the pieces and looking for edges that are the same but this doesn't sound feasible as the number of combinations will explode.
Note: By destroyed documents I am talking about taking a document (printed out) and then shredding it into small pieces and reassembling it by determining which pieces fit together.
image-processing
New contributor
$endgroup$
add a comment |
$begingroup$
I am not certain this is the proper site for this question however I am mainly looking for resources on this topic (perhaps code). I was watching TV and one of the characters had a lawyer who destroyed his documents using a paper shredder. A lab tech said that the shredder was special.
I am not familiar with this area of computer science/ mathematics but I am looking for information on efficient algorithms to reconstruct destroyed documents. I can come up with a naive approach that is brute force fairly easily I imagine but just going through all the pieces and looking for edges that are the same but this doesn't sound feasible as the number of combinations will explode.
Note: By destroyed documents I am talking about taking a document (printed out) and then shredding it into small pieces and reassembling it by determining which pieces fit together.
image-processing
New contributor
$endgroup$
1
$begingroup$
Can you edit your question to define "destroyed documents"?
$endgroup$
– lox
2 hours ago
1
$begingroup$
You should look at the methods used to recover the Stasi (East German secret police) archives that were shredded or mostly -- oops all the shredders are broken from over use -- torn up after the fall of the Berlin Wall. The BBC has a very high-level summary.
$endgroup$
– David Richerby
2 hours ago
add a comment |
$begingroup$
I am not certain this is the proper site for this question however I am mainly looking for resources on this topic (perhaps code). I was watching TV and one of the characters had a lawyer who destroyed his documents using a paper shredder. A lab tech said that the shredder was special.
I am not familiar with this area of computer science/ mathematics but I am looking for information on efficient algorithms to reconstruct destroyed documents. I can come up with a naive approach that is brute force fairly easily I imagine but just going through all the pieces and looking for edges that are the same but this doesn't sound feasible as the number of combinations will explode.
Note: By destroyed documents I am talking about taking a document (printed out) and then shredding it into small pieces and reassembling it by determining which pieces fit together.
image-processing
New contributor
$endgroup$
I am not certain this is the proper site for this question however I am mainly looking for resources on this topic (perhaps code). I was watching TV and one of the characters had a lawyer who destroyed his documents using a paper shredder. A lab tech said that the shredder was special.
I am not familiar with this area of computer science/ mathematics but I am looking for information on efficient algorithms to reconstruct destroyed documents. I can come up with a naive approach that is brute force fairly easily I imagine but just going through all the pieces and looking for edges that are the same but this doesn't sound feasible as the number of combinations will explode.
Note: By destroyed documents I am talking about taking a document (printed out) and then shredding it into small pieces and reassembling it by determining which pieces fit together.
image-processing
image-processing
New contributor
New contributor
edited 2 hours ago
Shogun
New contributor
asked 3 hours ago
ShogunShogun
1085
1085
New contributor
New contributor
1
$begingroup$
Can you edit your question to define "destroyed documents"?
$endgroup$
– lox
2 hours ago
1
$begingroup$
You should look at the methods used to recover the Stasi (East German secret police) archives that were shredded or mostly -- oops all the shredders are broken from over use -- torn up after the fall of the Berlin Wall. The BBC has a very high-level summary.
$endgroup$
– David Richerby
2 hours ago
add a comment |
1
$begingroup$
Can you edit your question to define "destroyed documents"?
$endgroup$
– lox
2 hours ago
1
$begingroup$
You should look at the methods used to recover the Stasi (East German secret police) archives that were shredded or mostly -- oops all the shredders are broken from over use -- torn up after the fall of the Berlin Wall. The BBC has a very high-level summary.
$endgroup$
– David Richerby
2 hours ago
1
1
$begingroup$
Can you edit your question to define "destroyed documents"?
$endgroup$
– lox
2 hours ago
$begingroup$
Can you edit your question to define "destroyed documents"?
$endgroup$
– lox
2 hours ago
1
1
$begingroup$
You should look at the methods used to recover the Stasi (East German secret police) archives that were shredded or mostly -- oops all the shredders are broken from over use -- torn up after the fall of the Berlin Wall. The BBC has a very high-level summary.
$endgroup$
– David Richerby
2 hours ago
$begingroup$
You should look at the methods used to recover the Stasi (East German secret police) archives that were shredded or mostly -- oops all the shredders are broken from over use -- torn up after the fall of the Berlin Wall. The BBC has a very high-level summary.
$endgroup$
– David Richerby
2 hours ago
add a comment |
1 Answer
1
active
oldest
votes
$begingroup$
Your problem is NP-Complete, even for strips (n strips yields (2n)!) used, so people use heuristics, transforms like Hough and morphological filters (to match continuity of text, but this heavily increases complexity for matching) or any kind of genetic / NN search, Ants Colony Optimization.
For summary of consecutive steps and various algorithm I recommend An Investigation into Automated Shredded Document Reconstruction using Heuristic Search Algorithms.
The problem itself may end up in nasty cases, when document is not fully sharp (blurred, printed with low resolution) and strips width is small and cut by physical cutter with dulled edges, because standard merging methods like panorama photo sticher gets lost and yield improper results. This is due to lost information by missing small strips, otherwise if you have full digital image cut into pieces, it is as hard as Jigsaw puzzle, non-digital image falls into approximate search.
To make algorithm automatic another problem is pieces feeding, rarely you can give axis aligned strips, so to start process it is nice to input all stripes as one picture with pieces lay by hand, this imposes another problem (this one is easy) to detect blobs and rotate them.
By special shredder instead of stripes yield very small rectangles. For comparison, P-1 class shredder gives stripes 6-12mm wide of any length (about 1800mm^2), class P-7 gives rectangles with area less than 5mm^2. When you get rectangles instead of stripes, problem yields (4n)! permutations, assuming one one-sided document, if there are lots of shreds from unrelated documents (no pictures, text only) in one bag, problem is not really tractable.
$endgroup$
$begingroup$
There may be (2n)! arrangements of the shredded strips, but does that still determine the time complexity? Whenever you find "matches" you can "group" them together, behaving as a "thick strip", where only the first and last edge matter for the sake of comparison against other strips. This "clumping" should reduce the search space hugely, but IDK if it will still beO(n!)
$endgroup$
– Alexander
14 mins ago
add a comment |
Your Answer
StackExchange.ready(function()
var channelOptions =
tags: "".split(" "),
id: "419"
;
initTagRenderer("".split(" "), "".split(" "), channelOptions);
StackExchange.using("externalEditor", function()
// Have to fire editor after snippets, if snippets enabled
if (StackExchange.settings.snippets.snippetsEnabled)
StackExchange.using("snippets", function()
createEditor();
);
else
createEditor();
);
function createEditor()
StackExchange.prepareEditor(
heartbeatType: 'answer',
autoActivateHeartbeat: false,
convertImagesToLinks: false,
noModals: true,
showLowRepImageUploadWarning: true,
reputationToPostImages: null,
bindNavPrevention: true,
postfix: "",
imageUploader:
brandingHtml: "Powered by u003ca class="icon-imgur-white" href="https://imgur.com/"u003eu003c/au003e",
contentPolicyHtml: "User contributions licensed under u003ca href="https://creativecommons.org/licenses/by-sa/3.0/"u003ecc by-sa 3.0 with attribution requiredu003c/au003e u003ca href="https://stackoverflow.com/legal/content-policy"u003e(content policy)u003c/au003e",
allowUrls: true
,
onDemand: true,
discardSelector: ".discard-answer"
,immediatelyShowMarkdownHelp:true
);
);
Shogun is a new contributor. Be nice, and check out our Code of Conduct.
Sign up or log in
StackExchange.ready(function ()
StackExchange.helpers.onClickDraftSave('#login-link');
);
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
StackExchange.ready(
function ()
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fcs.stackexchange.com%2fquestions%2f109567%2fefficient-algorithms-for-destroyed-document-reconstruction%23new-answer', 'question_page');
);
Post as a guest
Required, but never shown
1 Answer
1
active
oldest
votes
1 Answer
1
active
oldest
votes
active
oldest
votes
active
oldest
votes
$begingroup$
Your problem is NP-Complete, even for strips (n strips yields (2n)!) used, so people use heuristics, transforms like Hough and morphological filters (to match continuity of text, but this heavily increases complexity for matching) or any kind of genetic / NN search, Ants Colony Optimization.
For summary of consecutive steps and various algorithm I recommend An Investigation into Automated Shredded Document Reconstruction using Heuristic Search Algorithms.
The problem itself may end up in nasty cases, when document is not fully sharp (blurred, printed with low resolution) and strips width is small and cut by physical cutter with dulled edges, because standard merging methods like panorama photo sticher gets lost and yield improper results. This is due to lost information by missing small strips, otherwise if you have full digital image cut into pieces, it is as hard as Jigsaw puzzle, non-digital image falls into approximate search.
To make algorithm automatic another problem is pieces feeding, rarely you can give axis aligned strips, so to start process it is nice to input all stripes as one picture with pieces lay by hand, this imposes another problem (this one is easy) to detect blobs and rotate them.
By special shredder instead of stripes yield very small rectangles. For comparison, P-1 class shredder gives stripes 6-12mm wide of any length (about 1800mm^2), class P-7 gives rectangles with area less than 5mm^2. When you get rectangles instead of stripes, problem yields (4n)! permutations, assuming one one-sided document, if there are lots of shreds from unrelated documents (no pictures, text only) in one bag, problem is not really tractable.
$endgroup$
$begingroup$
There may be (2n)! arrangements of the shredded strips, but does that still determine the time complexity? Whenever you find "matches" you can "group" them together, behaving as a "thick strip", where only the first and last edge matter for the sake of comparison against other strips. This "clumping" should reduce the search space hugely, but IDK if it will still beO(n!)
$endgroup$
– Alexander
14 mins ago
add a comment |
$begingroup$
Your problem is NP-Complete, even for strips (n strips yields (2n)!) used, so people use heuristics, transforms like Hough and morphological filters (to match continuity of text, but this heavily increases complexity for matching) or any kind of genetic / NN search, Ants Colony Optimization.
For summary of consecutive steps and various algorithm I recommend An Investigation into Automated Shredded Document Reconstruction using Heuristic Search Algorithms.
The problem itself may end up in nasty cases, when document is not fully sharp (blurred, printed with low resolution) and strips width is small and cut by physical cutter with dulled edges, because standard merging methods like panorama photo sticher gets lost and yield improper results. This is due to lost information by missing small strips, otherwise if you have full digital image cut into pieces, it is as hard as Jigsaw puzzle, non-digital image falls into approximate search.
To make algorithm automatic another problem is pieces feeding, rarely you can give axis aligned strips, so to start process it is nice to input all stripes as one picture with pieces lay by hand, this imposes another problem (this one is easy) to detect blobs and rotate them.
By special shredder instead of stripes yield very small rectangles. For comparison, P-1 class shredder gives stripes 6-12mm wide of any length (about 1800mm^2), class P-7 gives rectangles with area less than 5mm^2. When you get rectangles instead of stripes, problem yields (4n)! permutations, assuming one one-sided document, if there are lots of shreds from unrelated documents (no pictures, text only) in one bag, problem is not really tractable.
$endgroup$
$begingroup$
There may be (2n)! arrangements of the shredded strips, but does that still determine the time complexity? Whenever you find "matches" you can "group" them together, behaving as a "thick strip", where only the first and last edge matter for the sake of comparison against other strips. This "clumping" should reduce the search space hugely, but IDK if it will still beO(n!)
$endgroup$
– Alexander
14 mins ago
add a comment |
$begingroup$
Your problem is NP-Complete, even for strips (n strips yields (2n)!) used, so people use heuristics, transforms like Hough and morphological filters (to match continuity of text, but this heavily increases complexity for matching) or any kind of genetic / NN search, Ants Colony Optimization.
For summary of consecutive steps and various algorithm I recommend An Investigation into Automated Shredded Document Reconstruction using Heuristic Search Algorithms.
The problem itself may end up in nasty cases, when document is not fully sharp (blurred, printed with low resolution) and strips width is small and cut by physical cutter with dulled edges, because standard merging methods like panorama photo sticher gets lost and yield improper results. This is due to lost information by missing small strips, otherwise if you have full digital image cut into pieces, it is as hard as Jigsaw puzzle, non-digital image falls into approximate search.
To make algorithm automatic another problem is pieces feeding, rarely you can give axis aligned strips, so to start process it is nice to input all stripes as one picture with pieces lay by hand, this imposes another problem (this one is easy) to detect blobs and rotate them.
By special shredder instead of stripes yield very small rectangles. For comparison, P-1 class shredder gives stripes 6-12mm wide of any length (about 1800mm^2), class P-7 gives rectangles with area less than 5mm^2. When you get rectangles instead of stripes, problem yields (4n)! permutations, assuming one one-sided document, if there are lots of shreds from unrelated documents (no pictures, text only) in one bag, problem is not really tractable.
$endgroup$
Your problem is NP-Complete, even for strips (n strips yields (2n)!) used, so people use heuristics, transforms like Hough and morphological filters (to match continuity of text, but this heavily increases complexity for matching) or any kind of genetic / NN search, Ants Colony Optimization.
For summary of consecutive steps and various algorithm I recommend An Investigation into Automated Shredded Document Reconstruction using Heuristic Search Algorithms.
The problem itself may end up in nasty cases, when document is not fully sharp (blurred, printed with low resolution) and strips width is small and cut by physical cutter with dulled edges, because standard merging methods like panorama photo sticher gets lost and yield improper results. This is due to lost information by missing small strips, otherwise if you have full digital image cut into pieces, it is as hard as Jigsaw puzzle, non-digital image falls into approximate search.
To make algorithm automatic another problem is pieces feeding, rarely you can give axis aligned strips, so to start process it is nice to input all stripes as one picture with pieces lay by hand, this imposes another problem (this one is easy) to detect blobs and rotate them.
By special shredder instead of stripes yield very small rectangles. For comparison, P-1 class shredder gives stripes 6-12mm wide of any length (about 1800mm^2), class P-7 gives rectangles with area less than 5mm^2. When you get rectangles instead of stripes, problem yields (4n)! permutations, assuming one one-sided document, if there are lots of shreds from unrelated documents (no pictures, text only) in one bag, problem is not really tractable.
answered 2 hours ago
EvilEvil
8,47742447
8,47742447
$begingroup$
There may be (2n)! arrangements of the shredded strips, but does that still determine the time complexity? Whenever you find "matches" you can "group" them together, behaving as a "thick strip", where only the first and last edge matter for the sake of comparison against other strips. This "clumping" should reduce the search space hugely, but IDK if it will still beO(n!)
$endgroup$
– Alexander
14 mins ago
add a comment |
$begingroup$
There may be (2n)! arrangements of the shredded strips, but does that still determine the time complexity? Whenever you find "matches" you can "group" them together, behaving as a "thick strip", where only the first and last edge matter for the sake of comparison against other strips. This "clumping" should reduce the search space hugely, but IDK if it will still beO(n!)
$endgroup$
– Alexander
14 mins ago
$begingroup$
There may be (2n)! arrangements of the shredded strips, but does that still determine the time complexity? Whenever you find "matches" you can "group" them together, behaving as a "thick strip", where only the first and last edge matter for the sake of comparison against other strips. This "clumping" should reduce the search space hugely, but IDK if it will still be
O(n!)
$endgroup$
– Alexander
14 mins ago
$begingroup$
There may be (2n)! arrangements of the shredded strips, but does that still determine the time complexity? Whenever you find "matches" you can "group" them together, behaving as a "thick strip", where only the first and last edge matter for the sake of comparison against other strips. This "clumping" should reduce the search space hugely, but IDK if it will still be
O(n!)
$endgroup$
– Alexander
14 mins ago
add a comment |
Shogun is a new contributor. Be nice, and check out our Code of Conduct.
Shogun is a new contributor. Be nice, and check out our Code of Conduct.
Shogun is a new contributor. Be nice, and check out our Code of Conduct.
Shogun is a new contributor. Be nice, and check out our Code of Conduct.
Thanks for contributing an answer to Computer Science Stack Exchange!
- Please be sure to answer the question. Provide details and share your research!
But avoid …
- Asking for help, clarification, or responding to other answers.
- Making statements based on opinion; back them up with references or personal experience.
Use MathJax to format equations. MathJax reference.
To learn more, see our tips on writing great answers.
Sign up or log in
StackExchange.ready(function ()
StackExchange.helpers.onClickDraftSave('#login-link');
);
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
StackExchange.ready(
function ()
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fcs.stackexchange.com%2fquestions%2f109567%2fefficient-algorithms-for-destroyed-document-reconstruction%23new-answer', 'question_page');
);
Post as a guest
Required, but never shown
Sign up or log in
StackExchange.ready(function ()
StackExchange.helpers.onClickDraftSave('#login-link');
);
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
Sign up or log in
StackExchange.ready(function ()
StackExchange.helpers.onClickDraftSave('#login-link');
);
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
Sign up or log in
StackExchange.ready(function ()
StackExchange.helpers.onClickDraftSave('#login-link');
);
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
1
$begingroup$
Can you edit your question to define "destroyed documents"?
$endgroup$
– lox
2 hours ago
1
$begingroup$
You should look at the methods used to recover the Stasi (East German secret police) archives that were shredded or mostly -- oops all the shredders are broken from over use -- torn up after the fall of the Berlin Wall. The BBC has a very high-level summary.
$endgroup$
– David Richerby
2 hours ago