[동적계획법테크닉] 3. 광학 문자 인식

컨셉

A라고 인식한 단어가 B일 확률 == B인데 A라고 인식할 확률.

- 이유: 다음 표를 보자. (X → Y : X인데 Y로 인식할 확률)

X(실제)＼Y(인식)	A (인식)	B	C
A (실제)	0.5	0.2	0.3
B	0.1	0.7	0.2
C	0.3	0.0	0.7

예) 실제 A인데 B로 인식할 확률 = 0.2 / B로 인식했는데 실제론 A일 확률 = 0.2

실제론 X인데 Y로 인식할 확률(X를 보고 →Y찾기)이나, Y로 인식했는데 실제론 X일 확률이나(Y를 보고 X를 찾기) 같다.

OCR(index, real) //인식한 index번째 단어가 사실은 단어 real일 확률

OCR(index, real) = ∑ OCR(index-1, 단어 i) * (단어 i 다음에 단어 real이 올 확률) * (index번째로 온 단어를 real로 인식할 확률)

초안

내가 쓴 코드(오답, 328ms) ... 예제는 다 맞는데 ㅠㅠ

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88

/**
 * @name    9.4. 광학 문자 인식
 * @ID      OCR
 * @date    2019-08-20
 * @author  Sunmon
 */
 
#include <iostream>
#include <cstring>
#include <algorithm>        //std::find()
#include <vector>
using namespace std;
 
int M, N, Q;                    //M: 단어의 수, N: 인식한 단어의 수 Q: 문장의 수
string words[500];              //주어진 단어 저장
double firstWord[500];          //firstWord[i]: i번째 단어가 문장 맨 처음에 출현할 확률
double nextWord[500][500];      //nextWord[i,j]: words[i]다음 words[j]가 올 확률
double processWord[500][500];   //processWord[i,j]: words[i]인데 [j]로 분류할 확률 (실제:i, 인식:j)
string results[100];            //results[i]: 인식한 문장의 i번째 단어
double cache[500][500];         //cache[i][j]: results[i]가 사실 words[j]일 확률 저장
vector<string> answer;
 
 
//인식한 index번째 단어 results[index]가 사실 words[real]일 확률 리턴
double OCR(int index, int real)
{
    //results[index] == words[recognize]인 recognize를 찾는다
    int recognize = find(words, words+N, results[index]) - words;
    if(index == 0) return firstWord[recognize] * processWord[recognize][real];
 
    double& ret = cache[index][real];
    if(ret != -1) return ret;
 
    ret = 0;
 
    //앞 단어에 따른 현재 단어의 확률을 결정한다
    for(int i = 0; i<N; i++)
        ret += (OCR(index -1, i) * nextWord[i][real]) * processWord[recognize][real];
 
    return ret;
}
 
//OCR을 보정한 결과를 저장한다
void correctOCR(int index)
{
    //base case
    if(index == N) return;
 
    //index번째로 올 수 있는 단어 중 가장 높은 가능성을 가진 단어의 인덱스와 확률 저장
    int real = 0;
    double probability = 0;
 
    for(int i = 0; i<N; i++)
    {
        if(probability > OCR(index, i)) continue;
        real = i;
        probability = OCR(index, i);
    }
    answer.push_back(words[real]);
    correctOCR(index+1);
}
 
 
int main()
{
    //캐시 초기화
    fill_n(cache[0], 500*500, -1);
    
    //입력받기
    cin >> M >> Q;
    for(int i = 0; i< M; i++) cin >> words[i];
    for(int i = 0; i<M; i++) cin >> firstWord[i];
    for(int i = 0; i<M; i++)
        for(int j = 0; j<M; j++) cin >> nextWord[i][j];
    for(int i = 0; i<M; i++)
        for(int j = 0; j<M; j++) cin >> processWord[i][j];
 
    for(int i = 0; i<Q; i++)
    {
        answer = {};
        cin >> N;
        for(int j = 0; j < N; j++) cin >> results[j];
        correctOCR(0);
 
        for(auto& a: answer) cout << a << " ";
        cout << endl;
    }
}
Colored by Color Scripter

cs

재안

내용

주의

- std::find() //#include <algorithm>

iterator find(시작 interator, 끝 iterator, 찾을 대상) //대상을 찾아서 포인터를 반환. 못 찾으면 vector.end() 반환

(사실 `size_t`를 반환하는데 편하게 포인터라고 생각하자)

1
2
3
4
5
6
7
8

#include <algorithm>
string words[10];
string *ptr;
ptr = std::find(words, words + 10, "A");
 
vector<string> v;
vector<string>::iterator it;
it = std::find(v.begin(), v.end(), "A");

cs

참고

- std::find: http://www.cplusplus.com/reference/algorithm/find/

find - C++ Reference

123456789 template InputIterator find (InputIterator first, InputIterator last, const T& val) { while (first!=last) { if (*first==val) return first; ++first; } return last; }

www.cplusplus.com

저작자표시 비영리 변경금지

'Study > Algorithm' 카테고리의 다른 글

코 드 조 아 (0)	2019.11.17
[2017 카카오 예선] 보행자 천국 (0)	2019.09.04
[동적계획법 테크닉] 2. 여행 짐 싸기 (0)	2019.08.16
[동적계획법] 13. 두니발 박사의 탈옥 - 오답 (0)	2019.08.16
[동적계획법] 12. 폴리오미노 (0)	2019.08.13

에움길

[동적계획법테크닉] 3. 광학 문자 인식

컨셉

초안

재안

주의

참고

'Study > Algorithm' 카테고리의 다른 글

티스토리툴바

[동적계획법테크닉] 3. 광학 문자 인식

컨셉

초안

재안

주의

참고

'Study > Algorithm' 카테고리의 다른 글

'Study/Algorithm' Related Articles

티스토리툴바