Kavita/API/Services/MetadataService.cs
Joseph Milazzo 33db123e81
v0.4.8 Release (#720)
* Bump versions by dotnet-bump-version.

* Bump versions by dotnet-bump-version.

* Workflow updates (#658)

# Added
- Added: Added automatic character parsing for discord notifier. Now if the PR is over a certain character limit, it will trim and add an appropriate link to the full changelog. (Release for Stable, PR for Dev)

# Removed
- Removed: Removed Sentry map task from the workflow since Sentry is no longer used.

* Bump versions by dotnet-bump-version.

* Misc Updates (#665)

* Do not allow non-admins to change their passwords when authentication is disabled

* Clean up the login page so that input field text is black

* cleanup some resizing when typing a password and having a lot of users

* Changed the LastActive for a user to not just be login, but also when they open an already authenticated session.

* Bump versions by dotnet-bump-version.

* Logging Cleanup (#668)

* Do not allow non-admins to change their passwords when authentication is disabled

* Clean up the login page so that input field text is black

* cleanup some resizing when typing a password and having a lot of users

* Changed the LastActive for a user to not just be login, but also when they open an already authenticated session.

* Removed some verbose debugging statements and moved some debug to information to be more prevalent in logs for default installs.

* In Progress now sends progress information on the Series

* Add ability to add cards to recently added when new series are added in backend

* Implemented the ability to click the glasses icon to turn off incognito mode from within the reader so you can start tracking progress

* Don't warn the user about authentication when they don't touch that control

* Bump versions by dotnet-bump-version.

* Changed the stats that are sent back to stat server from installed server.

* Revert "Changed the stats that are sent back to stat server from installed server."

This reverts commit 644cb6d1f6.

* Bump versions by dotnet-bump-version.

* Bump versions by dotnet-bump-version.

* Bulk Add to Collection (#674)

* Fixed the typeahead not having the same size input box as other inputs

* Implemented the ability to add multiple series to a collection through bulk operations flow. Updated book parser to handle "@import url('...');" syntax as well as @import '...';

* Implemented the ability to create a new Collection tag via bulk operations flow.

* Bump versions by dotnet-bump-version.

* Bulk Operations for In Progress and Recently Added (#677)

* Don't log a message about bad match if the file is a cover image

* Enable bulk operations for In Progress and Recently Added

* Fixed a bad logic case

* Bump versions by dotnet-bump-version.

* Regression Fix (#680)

* Ensure we mount the backups directory for Docker users

* Fixed a huge logic bug that deleted files in users libraries

* Bump versions by dotnet-bump-version.

* Change chunk size to be a fixed 50 to validate if it's causing issue with refresh. Added some try catches to see if exceptions are causing issues. (#681)

* Bump versions by dotnet-bump-version.

* Fixed a bug where searching on localized name would fail to show on the search. Fixed a bug where extra spaces would cause the search results not to show properly. (#682)

* Bump versions by dotnet-bump-version.

* When we have a special marker, ensure we fall back to folder parsing to try and group correctly to the actual series before just accepting what we parsed. (#684)

Fixed a missed parsing case where comic special parsing wasn't being called on comic libraries.

* Bump versions by dotnet-bump-version.

* iOS Admin page dropdown fix (#686)

# Fixed:
- Fixed: Fixed an issue where the dropdown on the admin server page would not work on Safari or other iOS browsers.

* When the DB fails to save, log out all the series the user should look into for constraint issues and push a message to the admins connected to webui. (#687)

* Bump versions by dotnet-bump-version.

* Bump versions by dotnet-bump-version.

* Stat upload will now schedule itself between midnight and 6am in server time for upload. (#688)

* Bump versions by dotnet-bump-version.

* EPUB CSS Parsing Issues (#690)

* WIP. Rewrote some of the Regex to better support css escaping. We now escape background-image, border-image, and list-style-image within css files.

* Added position relative to help with positioning on books that are just absolute positioned elements.

* When there is absolute positioning, like in some epub based comics, suppress the bottom action bar since it won't render in the correct location.

* Fixed tests

* Commented out tests

* Bump versions by dotnet-bump-version.

* More EPUB Scoping Fixes (#691)

* Added better handling around when importing css files that are empty. Moved comment removal on css files to before some css whitespace cleanup to get better matches.

* Some enhancements on the checks to see if we need the bottom action bar on reader. Now we don't query DOM and have something that works more reliably.

* Bump versions by dotnet-bump-version.

* Fixed an issue where docker users were not properly backing up the database. Removed an empty File for when covers/ had nothing in it. (#692)

* Bump versions by dotnet-bump-version.

* Fallback to Folder Parsing Issue (#694)

* Fixed a bug in the scanner where we fall back to parsing from folders for poorly named files. The code was exiting early if a chapter or volume could be parsed out.

* Fixed a unit test by tweaking a regex for fallback

* Bump versions by dotnet-bump-version.

* KavitaStats Cleanup (#695)

* Refactored Stats code to be much cleaner and use better naming.

* Cleaned up the actual http code to use Flurl and to return if the upload was successful or not so we can delete the file where appropriate.

* More refactoring for the stats code to clean it up and keep it consistent with our standards.

* Removed a confusing log statement

* Added support for old api key header from original stat server

* Use the correct endpoint, not the new one.

* Code smell

* Bump versions by dotnet-bump-version.

* Bulk Deletion (#697)

* Implemented bulk deletion of series

* Don't show unauthorized exception on UI, just redirect to the login page.

* Bump versions by dotnet-bump-version.

* Cover Image Picking + Forwarding Headers with EPUBs (#700)

* Ensure Kavita knows about forwarding headers (fixes issue with epub urls not going through https with reverse proxy). Fixed a case where cover image selection preferred nested folders vs files in root directory.

* Fixed broken unit test

* Added bug that I fixed to the unit tests

* Cover Image Picking + Forwarding Headers with EPUBs (#702)

* Updating GA Bump version temporarily for fix (#703)

* Bump versions by dotnet-bump-version.

* Cover Image Picking + Forwarding Headers with EPUBs (GA Fix) (#704)

* Bump versions by dotnet-bump-version.

* Vacation Fixes (#709)

* Ignore system and hidden folders when performing directory scan.

* Fixed the comic parser tests not using Comic mode for parsing.

* Accept all forwarded headers and use them.

* Ignore some changes from another branch

* Bump versions by dotnet-bump-version.

* Breaking Changes: Docker Parity (#698)

* Refactored all the config files for Kavita to be loaded from config/. This will allow docker to just mount one folder and for Update functionality to be trivial.

* Cleaned up documentation around new update method.

* Updated docker files to support config directory

* Removed entrypoint, no longer needed

* Update appsettings to point to config directory for logs

* Updated message for docker users that are upgrading

* Ensure that docker users that have not updated their mount points from upgrade cannot start the server

* Code smells

* More cleanup

* Added entrypoint to fix bind mount issues

* Updated README with new folder structure

* Fixed build system for new setup

* Updated string path if user is docker

* Updated the migration flow for docker to work properly and Fixed LogFile configuration updating.

* Migrating docker images is now working 100%

* Fixed config from bad code

* Code cleanup

Co-authored-by: Chris Plaatjes <kizaing@gmail.com>

* Bump versions by dotnet-bump-version.

* Feature/docker parity (#714)

* Refactored all the config files for Kavita to be loaded from config/. This will allow docker to just mount one folder and for Update functionality to be trivial.

* Cleaned up documentation around new update method.

* Updated docker files to support config directory

* Removed entrypoint, no longer needed

* Update appsettings to point to config directory for logs

* Updated message for docker users that are upgrading

* Ensure that docker users that have not updated their mount points from upgrade cannot start the server

* Code smells

* More cleanup

* Added entrypoint to fix bind mount issues

* Updated README with new folder structure

* Fixed build system for new setup

* Updated string path if user is docker

* Updated the migration flow for docker to work properly and Fixed LogFile configuration updating.

* Migrating docker images is now working 100%

* Fixed config from bad code

* Code cleanup

* Fixed monorepo-build.sh

Co-authored-by: Chris Plaatjes <kizaing@gmail.com>

* Breaking Changes: Docker Parity (#715)

* Fixed a bug in the copy directory to directory in the migration

* Somehow GetFiles lost static modifier.

* Bump versions by dotnet-bump-version.

* Build issue (#716)

* Fixed a bug in the copy directory to directory in the migration

* Somehow GetFiles lost static modifier.

* Please work

* Bump versions by dotnet-bump-version.

* Bump versions by dotnet-bump-version.

* Shakeout Changes (#717)

* Make the appsettings public on Configuration and change how we detect when to migrate for non-docker users.

* Fixed up non-docker copy command and removed duplicate check on source directory for a copy.

* Don't delete files unless we know we are successful

* Bump versions by dotnet-bump-version.

* Fixed a migration issue on docker happening too many times or throwing exception when source wasn't there. (#719)

* Bump versions by dotnet-bump-version.

* Version bump for release (#718)

* Bump versions by dotnet-bump-version.

Co-authored-by: Robbie Davis <robbie@therobbiedavis.com>
Co-authored-by: YEGCSharpDev <89283498+YEGCSharpDev@users.noreply.github.com>
Co-authored-by: Chris Plaatjes <kizaing@gmail.com>
2021-11-04 05:29:02 -07:00


using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.IO;
using System.Linq;
using System.Threading.Tasks;
using API.Comparators;
using API.Data.Metadata;
using API.Data.Repositories;
using API.Entities;
using API.Entities.Enums;
using API.Extensions;
using API.Helpers;
using API.Interfaces;
using API.Interfaces.Services;
using API.SignalR;
using Microsoft.AspNetCore.SignalR;
using Microsoft.Extensions.Logging;
namespace API.Services
{
public class MetadataService : IMetadataService
{
private readonly IUnitOfWork _unitOfWork;
private readonly ILogger<MetadataService> _logger;
private readonly IArchiveService _archiveService;
private readonly IBookService _bookService;
private readonly IImageService _imageService;
private readonly IHubContext<MessageHub> _messageHub;
private readonly ChapterSortComparerZeroFirst _chapterSortComparerForInChapterSorting = new ChapterSortComparerZeroFirst();
public MetadataService(IUnitOfWork unitOfWork, ILogger<MetadataService> logger,
IArchiveService archiveService, IBookService bookService, IImageService imageService, IHubContext<MessageHub> messageHub)
{
_unitOfWork = unitOfWork;
_logger = logger;
_archiveService = archiveService;
_bookService = bookService;
_imageService = imageService;
_messageHub = messageHub;
}
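/// <summary>
/// Determines whether an entity should regenerate its cover image.
/// </summary>
/// <remarks>If a cover image is locked but the underlying file has been deleted, this will allow regenerating.</remarks>
/// <param name="coverImage">File name of the existing cover image, resolved against <paramref name="coverImageDirectory"/></param>
/// <param name="firstFile">First file of the entity, used to detect modification. May be null</param>
/// <param name="forceUpdate">Regenerate even if the underlying file has not been modified</param>
/// <param name="isCoverLocked">If the cover is locked and still exists on disk, it is never regenerated</param>
/// <param name="coverImageDirectory">Directory where cover images are. Defaults to <see cref="DirectoryService.CoverImageDirectory"/></param>
/// <returns>True if the cover image should be regenerated</returns>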
public static bool ShouldUpdateCoverImage(string coverImage, MangaFile firstFile, bool forceUpdate = false,
bool isCoverLocked = false, string coverImageDirectory = null)
{
if (string.IsNullOrEmpty(coverImageDirectory))
{
coverImageDirectory = DirectoryService.CoverImageDirectory;
}
var fileExists = File.Exists(Path.Join(coverImageDirectory, coverImage));
if (isCoverLocked && fileExists) return false;
if (forceUpdate) return true;
return (firstFile != null && firstFile.HasFileBeenModified()) || !HasCoverImage(coverImage, fileExists);
}
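private static bool HasCoverImage(string coverImage)
{
// coverImage is a file name relative to the cover image directory (see ShouldUpdateCoverImage),
// so resolve it against that directory before checking for existence.
return HasCoverImage(coverImage, File.Exists(Path.Join(DirectoryService.CoverImageDirectory, coverImage)));
}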
private static bool HasCoverImage(string coverImage, bool fileExists)
{
return !string.IsNullOrEmpty(coverImage) && fileExists;
}
private string GetCoverImage(MangaFile file, int volumeId, int chapterId)
{
file.UpdateLastModified();
switch (file.Format)
{
case MangaFormat.Pdf:
case MangaFormat.Epub:
return _bookService.GetCoverImage(file.FilePath, ImageService.GetChapterFormat(chapterId, volumeId));
case MangaFormat.Image:
var coverImage = _imageService.GetCoverFile(file);
return _imageService.GetCoverImage(coverImage, ImageService.GetChapterFormat(chapterId, volumeId));
case MangaFormat.Archive:
return _archiveService.GetCoverImage(file.FilePath, ImageService.GetChapterFormat(chapterId, volumeId));
default:
return string.Empty;
}
}
/// <summary>
/// Updates the metadata for a Chapter
/// </summary>
/// <param name="chapter"></param>
/// <param name="forceUpdate">Force updating cover image even if underlying file has not been modified or chapter already has a cover image</param>
public bool UpdateMetadata(Chapter chapter, bool forceUpdate)
{
var firstFile = chapter.Files.OrderBy(x => x.Chapter).FirstOrDefault();
// Without a file there is nothing to generate a cover from, and GetCoverImage would dereference null.
if (firstFile == null) return false;
if (ShouldUpdateCoverImage(chapter.CoverImage, firstFile, forceUpdate, chapter.CoverImageLocked))
{
_logger.LogDebug("[MetadataService] Generating cover image for {File}", firstFile.FilePath);
chapter.CoverImage = GetCoverImage(firstFile, chapter.VolumeId, chapter.Id);
return true;
}
return false;
}
/// <summary>
/// Updates the metadata for a Volume
/// </summary>
/// <param name="volume"></param>
/// <param name="forceUpdate">Force updating cover image even if underlying file has not been modified or chapter already has a cover image</param>
public bool UpdateMetadata(Volume volume, bool forceUpdate)
{
// We need to check if Volume coverImage matches first chapters if forceUpdate is false
if (volume == null || !ShouldUpdateCoverImage(volume.CoverImage, null, forceUpdate)) return false;
volume.Chapters ??= new List<Chapter>();
var firstChapter = volume.Chapters.OrderBy(x => double.Parse(x.Number), _chapterSortComparerForInChapterSorting).FirstOrDefault();
if (firstChapter == null) return false;
volume.CoverImage = firstChapter.CoverImage;
return true;
}
/// <summary>
/// Updates metadata for Series
/// </summary>
/// <param name="series"></param>
/// <param name="forceUpdate">Force updating cover image even if underlying file has not been modified or chapter already has a cover image</param>
public bool UpdateMetadata(Series series, bool forceUpdate)
{
var madeUpdate = false;
if (series == null) return false;
// NOTE: This will fail if we replace the cover of the first volume on a first scan, because the series will already have a cover image
if (ShouldUpdateCoverImage(series.CoverImage, null, forceUpdate, series.CoverImageLocked))
{
series.Volumes ??= new List<Volume>();
var firstCover = series.Volumes.GetCoverImage(series.Format);
string coverImage = null;
if (firstCover == null && series.Volumes.Any())
{
// If firstCover is null and one volume, the whole series is Chapters under Vol 0.
if (series.Volumes.Count == 1)
{
coverImage = series.Volumes[0].Chapters.OrderBy(c => double.Parse(c.Number), _chapterSortComparerForInChapterSorting)
.FirstOrDefault(c => !c.IsSpecial)?.CoverImage;
madeUpdate = true;
}
if (!HasCoverImage(coverImage))
{
coverImage = series.Volumes[0].Chapters.OrderBy(c => double.Parse(c.Number), _chapterSortComparerForInChapterSorting)
.FirstOrDefault()?.CoverImage;
madeUpdate = true;
}
}
series.CoverImage = firstCover?.CoverImage ?? coverImage;
}
return UpdateSeriesSummary(series, forceUpdate) || madeUpdate;
}
private bool UpdateSeriesSummary(Series series, bool forceUpdate)
{
// NOTE: This can be problematic when the file changes and a summary already exists, but it is likely
// better to let the user kick off a refresh metadata on an individual Series than having overhead of
// checking File last write time.
if (!string.IsNullOrEmpty(series.Summary) && !forceUpdate) return false;
var isBook = series.Library.Type == LibraryType.Book;
var firstVolume = series.Volumes.FirstWithChapters(isBook);
var firstChapter = firstVolume?.Chapters.GetFirstChapterWithFiles();
var firstFile = firstChapter?.Files.FirstOrDefault();
if (firstFile == null || (!forceUpdate && !firstFile.HasFileBeenModified())) return false;
if (Parser.Parser.IsPdf(firstFile.FilePath)) return false;
var comicInfo = GetComicInfo(series.Format, firstFile);
if (string.IsNullOrEmpty(comicInfo?.Summary)) return false;
series.Summary = comicInfo.Summary;
return true;
}
private ComicInfo GetComicInfo(MangaFormat format, MangaFile firstFile)
{
if (format is MangaFormat.Archive or MangaFormat.Epub)
{
return Parser.Parser.IsEpub(firstFile.FilePath) ? _bookService.GetComicInfo(firstFile.FilePath) : _archiveService.GetComicInfo(firstFile.FilePath);
}
return null;
}
/// <summary>
/// Refreshes Metadata for a whole library
/// </summary>
/// <remarks>This can be heavy on memory on the first run</remarks>
/// <param name="libraryId"></param>
/// <param name="forceUpdate">Force updating cover image even if underlying file has not been modified or chapter already has a cover image</param>
public async Task RefreshMetadata(int libraryId, bool forceUpdate = false)
{
var library = await _unitOfWork.LibraryRepository.GetLibraryForIdAsync(libraryId, LibraryIncludes.None);
_logger.LogInformation("[MetadataService] Beginning metadata refresh of {LibraryName}", library.Name);
var chunkInfo = await _unitOfWork.SeriesRepository.GetChunkInfo(library.Id);
var stopwatch = Stopwatch.StartNew();
var totalTime = 0L;
_logger.LogInformation("[MetadataService] Refreshing Library {LibraryName}. Total Items: {TotalSize}. Total Chunks: {TotalChunks} with {ChunkSize} size", library.Name, chunkInfo.TotalSize, chunkInfo.TotalChunks, chunkInfo.ChunkSize);
for (var chunk = 1; chunk <= chunkInfo.TotalChunks; chunk++)
{
totalTime += stopwatch.ElapsedMilliseconds;
stopwatch.Restart();
// Chunks are 1-based (they map to PageNumber), so chunk N covers series (N - 1) * ChunkSize through N * ChunkSize.
_logger.LogInformation("[MetadataService] Processing chunk {ChunkNumber} / {TotalChunks} with size {ChunkSize}. Series ({SeriesStart} - {SeriesEnd})",
chunk, chunkInfo.TotalChunks, chunkInfo.ChunkSize, (chunk - 1) * chunkInfo.ChunkSize, chunk * chunkInfo.ChunkSize);
var nonLibrarySeries = await _unitOfWork.SeriesRepository.GetFullSeriesForLibraryIdAsync(library.Id,
new UserParams()
{
PageNumber = chunk,
PageSize = chunkInfo.ChunkSize
});
_logger.LogDebug("[MetadataService] Fetched {SeriesCount} series for refresh", nonLibrarySeries.Count);
Parallel.ForEach(nonLibrarySeries, series =>
{
try
{
_logger.LogDebug("[MetadataService] Processing series {SeriesName}", series.OriginalName);
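var volumeUpdated = false;
foreach (var volume in series.Volumes)
{
var chapterUpdated = false;
foreach (var chapter in volume.Chapters)
{
// Accumulate with |= so an earlier chapter's update is not overwritten by a later chapter's false.
chapterUpdated |= UpdateMetadata(chapter, forceUpdate);
}
volumeUpdated |= UpdateMetadata(volume, chapterUpdated || forceUpdate);
}
UpdateMetadata(series, volumeUpdated || forceUpdate);
}
catch (Exception ex)
{
// Log rather than silently swallow, so a failing series can be diagnosed without aborting the rest of the chunk.
_logger.LogError(ex, "[MetadataService] There was an exception during metadata refresh for {SeriesName}", series.Name);
}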
});
var committed = _unitOfWork.HasChanges() && await _unitOfWork.CommitAsync();
// The same progress message is logged whether or not anything was committed; chunks are 1-based, so the range starts at (chunk - 1) * ChunkSize.
_logger.LogInformation(
"[MetadataService] Processed {SeriesStart} - {SeriesEnd} out of {TotalSeries} series in {ElapsedScanTime} milliseconds for {LibraryName}",
(chunk - 1) * chunkInfo.ChunkSize, ((chunk - 1) * chunkInfo.ChunkSize) + nonLibrarySeries.Count, chunkInfo.TotalSize, stopwatch.ElapsedMilliseconds, library.Name);
if (committed)
{
foreach (var series in nonLibrarySeries)
{
await _messageHub.Clients.All.SendAsync(SignalREvents.RefreshMetadata, MessageFactory.RefreshMetadataEvent(library.Id, series.Id));
}
}
}
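// Include the time spent on the final chunk, which the loop only adds at the start of the next iteration.
totalTime += stopwatch.ElapsedMilliseconds;
_logger.LogInformation("[MetadataService] Updated metadata for {SeriesNumber} series in library {LibraryName} in {ElapsedMilliseconds} milliseconds total", chunkInfo.TotalSize, library.Name, totalTime);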
}
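/// <summary>
/// Refreshes Metadata for a Series.
/// </summary>
/// <param name="libraryId"></param>
/// <param name="seriesId"></param>
/// <param name="forceUpdate">Force updating cover image even if underlying file has not been modified or chapter already has a cover image</param>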
public async Task RefreshMetadataForSeries(int libraryId, int seriesId, bool forceUpdate = false)
{
var sw = Stopwatch.StartNew();
var series = await _unitOfWork.SeriesRepository.GetFullSeriesForSeriesIdAsync(seriesId);
if (series == null)
{
_logger.LogError("[MetadataService] Series {SeriesId} was not found on Library {LibraryId}", seriesId, libraryId);
return;
}
_logger.LogInformation("[MetadataService] Beginning metadata refresh of {SeriesName}", series.Name);
var volumeUpdated = false;
foreach (var volume in series.Volumes)
{
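var chapterUpdated = false;
foreach (var chapter in volume.Chapters)
{
// Accumulate with |= so an earlier chapter's update is not overwritten by a later chapter's false.
chapterUpdated |= UpdateMetadata(chapter, forceUpdate);
}
volumeUpdated |= UpdateMetadata(volume, chapterUpdated || forceUpdate);
}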
UpdateMetadata(series, volumeUpdated || forceUpdate);
if (_unitOfWork.HasChanges() && await _unitOfWork.CommitAsync())
{
await _messageHub.Clients.All.SendAsync(SignalREvents.RefreshMetadata, MessageFactory.RefreshMetadataEvent(series.LibraryId, series.Id));
}
_logger.LogInformation("[MetadataService] Updated metadata for {SeriesName} in {ElapsedMilliseconds} milliseconds", series.Name, sw.ElapsedMilliseconds);
}
}
}